{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yfv52r2G33jY"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/AI4Finance-Foundation/FinRL-Tutorials/blob/master/1-Introduction/Stock_NeurIPS2018_call_func_rolling_window_SB3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gXaoZs2lh1hi"
      },
      "source": [
        "# Deep Reinforcement Learning for Stock Trading from Scratch: Multiple Stock Trading\n",
        "\n",
        "* **Pytorch Version** \n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lGunVt8oLCVS"
      },
      "source": [
        "# Content"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sApkDlD9LIZv"
      },
      "source": [
        "<a id='0'></a>\n",
        "Task Discription"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HjLD2TZSLKZ-"
      },
      "source": [
        "We train a DRL agent for stock trading. This task is modeled as a Markov Decision Process (MDP), and the objective function is maximizing (expected) cumulative return.\n",
        "\n",
        "We specify the state-action-reward as follows:\n",
        "\n",
        "* **State s**: The state space represents an agent's perception of the market environment. Just like a human trader analyzing various information, here our agent passively observes many features and learns by interacting with the market environment (usually by replaying historical data).\n",
        "\n",
        "* **Action a**: The action space includes allowed actions that an agent can take at each state. For example, a ∈ {−1, 0, 1}, where −1, 0, 1 represent\n",
        "selling, holding, and buying. When an action operates multiple shares, a ∈{−k, ..., −1, 0, 1, ..., k}, e.g.. \"Buy\n",
        "10 shares of AAPL\" or \"Sell 10 shares of AAPL\" are 10 or −10, respectively\n",
        "\n",
        "* **Reward function r(s, a, s′)**: Reward is an incentive for an agent to learn a better policy. For example, it can be the change of the portfolio value when taking a at state s and arriving at new state s',  i.e., r(s, a, s′) = v′ − v, where v′ and v represent the portfolio values at state s′ and s, respectively\n",
        "\n",
        "\n",
        "**Market environment**: 30 consituent stocks of Dow Jones Industrial Average (DJIA) index. Accessed at the starting date of the testing period.\n",
        "\n",
        "\n",
        "The data for this case study is obtained from Yahoo Finance API. The data contains Open-High-Low-Close price and volume.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ffsre789LY08"
      },
      "source": [
        "<a id='1'></a>\n",
        "# Part 1. Install Python Packages"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Uy5_PTmOh1hj"
      },
      "source": [
        "<a id='1.1'></a>\n",
        "## 1.1. Install packages\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "id": "mPT0ipYE28wL",
        "outputId": "1cb18b7e-62aa-47de-b1e7-c37711d49555"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
            "Collecting swig\n",
            "  Downloading swig-4.1.1-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.8 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.8/1.8 MB\u001b[0m \u001b[31m23.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hInstalling collected packages: swig\n",
            "Successfully installed swig-4.1.1\n",
            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
            "Collecting wrds\n",
            "  Downloading wrds-3.1.6-py3-none-any.whl (12 kB)\n",
            "Collecting psycopg2-binary\n",
            "  Downloading psycopg2_binary-2.9.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.0/3.0 MB\u001b[0m \u001b[31m47.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: scipy in /usr/local/lib/python3.9/dist-packages (from wrds) (1.10.1)\n",
            "Requirement already satisfied: pandas in /usr/local/lib/python3.9/dist-packages (from wrds) (1.4.4)\n",
            "Requirement already satisfied: sqlalchemy<2 in /usr/local/lib/python3.9/dist-packages (from wrds) (1.4.47)\n",
            "Requirement already satisfied: numpy in /usr/local/lib/python3.9/dist-packages (from wrds) (1.22.4)\n",
            "Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.9/dist-packages (from sqlalchemy<2->wrds) (2.0.2)\n",
            "Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.9/dist-packages (from pandas->wrds) (2.8.2)\n",
            "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.9/dist-packages (from pandas->wrds) (2022.7.1)\n",
            "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/dist-packages (from python-dateutil>=2.8.1->pandas->wrds) (1.16.0)\n",
            "Installing collected packages: psycopg2-binary, wrds\n",
            "Successfully installed psycopg2-binary-2.9.5 wrds-3.1.6\n",
            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
            "Collecting pyportfolioopt\n",
            "  Downloading pyportfolioopt-1.5.4-py3-none-any.whl (61 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m61.9/61.9 KB\u001b[0m \u001b[31m3.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: numpy<2.0.0,>=1.22.4 in /usr/local/lib/python3.9/dist-packages (from pyportfolioopt) (1.22.4)\n",
            "Requirement already satisfied: pandas>=0.19 in /usr/local/lib/python3.9/dist-packages (from pyportfolioopt) (1.4.4)\n",
            "Requirement already satisfied: scipy<2.0,>=1.3 in /usr/local/lib/python3.9/dist-packages (from pyportfolioopt) (1.10.1)\n",
            "Requirement already satisfied: cvxpy<2.0.0,>=1.1.10 in /usr/local/lib/python3.9/dist-packages (from pyportfolioopt) (1.3.1)\n",
            "Requirement already satisfied: ecos>=2 in /usr/local/lib/python3.9/dist-packages (from cvxpy<2.0.0,>=1.1.10->pyportfolioopt) (2.0.12)\n",
            "Requirement already satisfied: scs>=1.1.6 in /usr/local/lib/python3.9/dist-packages (from cvxpy<2.0.0,>=1.1.10->pyportfolioopt) (3.2.2)\n",
            "Requirement already satisfied: osqp>=0.4.1 in /usr/local/lib/python3.9/dist-packages (from cvxpy<2.0.0,>=1.1.10->pyportfolioopt) (0.6.2.post0)\n",
            "Requirement already satisfied: setuptools>65.5.1 in /usr/local/lib/python3.9/dist-packages (from cvxpy<2.0.0,>=1.1.10->pyportfolioopt) (67.6.1)\n",
            "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.9/dist-packages (from pandas>=0.19->pyportfolioopt) (2022.7.1)\n",
            "Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.9/dist-packages (from pandas>=0.19->pyportfolioopt) (2.8.2)\n",
            "Requirement already satisfied: qdldl in /usr/local/lib/python3.9/dist-packages (from osqp>=0.4.1->cvxpy<2.0.0,>=1.1.10->pyportfolioopt) (0.1.5.post3)\n",
            "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/dist-packages (from python-dateutil>=2.8.1->pandas>=0.19->pyportfolioopt) (1.16.0)\n",
            "Installing collected packages: pyportfolioopt\n",
            "Successfully installed pyportfolioopt-1.5.4\n",
            "⏬ Downloading https://github.com/jaimergp/miniforge/releases/latest/download/Mambaforge-colab-Linux-x86_64.sh...\n",
            "📦 Installing...\n",
            "📌 Adjusting configuration...\n",
            "🩹 Patching environment...\n",
            "⏲ Done in 0:00:19\n",
            "🔁 Restarting kernel...\n",
            "Selecting previously unselected package libgl1-mesa-glx:amd64.\n",
            "(Reading database ... 128288 files and directories currently installed.)\n",
            "Preparing to unpack .../libgl1-mesa-glx_21.2.6-0ubuntu0.1~20.04.2_amd64.deb ...\n",
            "Unpacking libgl1-mesa-glx:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...\n",
            "Selecting previously unselected package swig4.0.\n",
            "Preparing to unpack .../swig4.0_4.0.1-5build1_amd64.deb ...\n",
            "Unpacking swig4.0 (4.0.1-5build1) ...\n",
            "Selecting previously unselected package swig.\n",
            "Preparing to unpack .../swig_4.0.1-5build1_all.deb ...\n",
            "Unpacking swig (4.0.1-5build1) ...\n",
            "Setting up libgl1-mesa-glx:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...\n",
            "Setting up swig4.0 (4.0.1-5build1) ...\n",
            "Setting up swig (4.0.1-5build1) ...\n",
            "Processing triggers for man-db (2.9.1-1) ...\n",
            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
            "Collecting git+https://github.com/AI4Finance-Foundation/FinRL.git\n",
            "  Cloning https://github.com/AI4Finance-Foundation/FinRL.git to /tmp/pip-req-build-zsjn96ks\n",
            "  Running command git clone --filter=blob:none --quiet https://github.com/AI4Finance-Foundation/FinRL.git /tmp/pip-req-build-zsjn96ks\n",
            "  Resolved https://github.com/AI4Finance-Foundation/FinRL.git to commit d3e35c7d94da2b0b4f44734a226de360b5c09d52\n",
            "  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting elegantrl@ git+https://github.com/AI4Finance-Foundation/ElegantRL.git#egg=elegantrl\n",
            "  Cloning https://github.com/AI4Finance-Foundation/ElegantRL.git to /tmp/pip-install-nhzd12k3/elegantrl_0fdb0f9b53ed4e8aac03822e093bbd98\n",
            "  Running command git clone --filter=blob:none --quiet https://github.com/AI4Finance-Foundation/ElegantRL.git /tmp/pip-install-nhzd12k3/elegantrl_0fdb0f9b53ed4e8aac03822e093bbd98\n",
            "  Resolved https://github.com/AI4Finance-Foundation/ElegantRL.git to commit 594b0c31de443a24c1032b75418fdc134664e92f\n",
            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting pyfolio@ git+https://github.com/quantopian/pyfolio.git#egg=pyfolio-0.9.2\n",
            "  Cloning https://github.com/quantopian/pyfolio.git to /tmp/pip-install-nhzd12k3/pyfolio_26e97b504ff64373aaa7bb9e77b71a1d\n",
            "  Running command git clone --filter=blob:none --quiet https://github.com/quantopian/pyfolio.git /tmp/pip-install-nhzd12k3/pyfolio_26e97b504ff64373aaa7bb9e77b71a1d\n",
            "  Resolved https://github.com/quantopian/pyfolio.git to commit 4b901f6d73aa02ceb6d04b7d83502e5c6f2e81aa\n",
            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting stockstats>=0.4.0\n",
            "  Downloading stockstats-0.5.2-py2.py3-none-any.whl (20 kB)\n",
            "Collecting lz4\n",
            "  Downloading lz4-4.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m12.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting scikit-learn>=0.21.0\n",
            "  Downloading scikit_learn-1.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m9.6/9.6 MB\u001b[0m \u001b[31m31.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting gputil\n",
            "  Downloading GPUtil-1.4.0.tar.gz (5.5 kB)\n",
            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting stable-baselines3<2.0.0,>=1.6.2\n",
            "  Downloading stable_baselines3-1.7.0-py3-none-any.whl (171 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m171.8/171.8 kB\u001b[0m \u001b[31m15.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting jqdatasdk\n",
            "  Downloading jqdatasdk-1.8.11-py3-none-any.whl (158 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m158.2/158.2 kB\u001b[0m \u001b[31m16.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting alpaca_trade_api>=2.1.0\n",
            "  Downloading alpaca_trade_api-3.0.0-py3-none-any.whl (33 kB)\n",
            "Collecting yfinance\n",
            "  Downloading yfinance-0.2.14-py2.py3-none-any.whl (59 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m59.7/59.7 kB\u001b[0m \u001b[31m6.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting ccxt>=1.66.32\n",
            "  Downloading ccxt-3.0.46-py2.py3-none-any.whl (3.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.6/3.6 MB\u001b[0m \u001b[31m49.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pandas>=1.1.5\n",
            "  Downloading pandas-1.5.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m12.2/12.2 MB\u001b[0m \u001b[31m54.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting tensorboardX\n",
            "  Downloading tensorboardX-2.6-py2.py3-none-any.whl (114 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m114.5/114.5 kB\u001b[0m \u001b[31m12.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting wrds>=3.1.6\n",
            "  Using cached wrds-3.1.6-py3-none-any.whl (12 kB)\n",
            "Collecting numpy>=1.17.3\n",
            "  Downloading numpy-1.24.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m17.3/17.3 MB\u001b[0m \u001b[31m43.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting gym>=0.17\n",
            "  Downloading gym-0.26.2.tar.gz (721 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m721.7/721.7 kB\u001b[0m \u001b[31m37.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting exchange_calendars==3.6.3\n",
            "  Downloading exchange_calendars-3.6.3.tar.gz (152 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m152.8/152.8 kB\u001b[0m \u001b[31m12.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting ray[default,tune]>=2.0.0\n",
            "  Downloading ray-2.3.1-cp39-cp39-manylinux2014_x86_64.whl (58.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.6/58.6 MB\u001b[0m \u001b[31m11.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting matplotlib\n",
            "  Downloading matplotlib-3.7.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m11.6/11.6 MB\u001b[0m \u001b[31m73.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting importlib-metadata==4.13.0\n",
            "  Downloading importlib_metadata-4.13.0-py3-none-any.whl (23 kB)\n",
            "Collecting pyluach\n",
            "  Downloading pyluach-2.2.0-py3-none-any.whl (25 kB)\n",
            "Collecting python-dateutil\n",
            "  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m247.7/247.7 kB\u001b[0m \u001b[31m22.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pytz\n",
            "  Downloading pytz-2023.3-py2.py3-none-any.whl (502 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m502.3/502.3 kB\u001b[0m \u001b[31m31.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: toolz in /usr/local/lib/python3.9/site-packages (from exchange_calendars==3.6.3->finrl==0.3.5) (0.12.0)\n",
            "Collecting korean_lunar_calendar\n",
            "  Downloading korean_lunar_calendar-0.3.1-py3-none-any.whl (9.0 kB)\n",
            "Collecting zipp>=0.5\n",
            "  Downloading zipp-3.15.0-py3-none-any.whl (6.8 kB)\n",
            "Collecting msgpack==1.0.3\n",
            "  Downloading msgpack-1.0.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (322 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m322.2/322.2 kB\u001b[0m \u001b[31m26.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting websocket-client<2,>=0.56.0\n",
            "  Downloading websocket_client-1.5.1-py3-none-any.whl (55 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m55.9/55.9 kB\u001b[0m \u001b[31m6.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting deprecation==2.1.0\n",
            "  Downloading deprecation-2.1.0-py2.py3-none-any.whl (11 kB)\n",
            "Requirement already satisfied: requests<3,>2 in /usr/local/lib/python3.9/site-packages (from alpaca_trade_api>=2.1.0->finrl==0.3.5) (2.28.2)\n",
            "Collecting websockets<11,>=9.0\n",
            "  Downloading websockets-10.4-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (106 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m106.5/106.5 kB\u001b[0m \u001b[31m9.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting PyYAML==6.0\n",
            "  Downloading PyYAML-6.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (661 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m661.8/661.8 kB\u001b[0m \u001b[31m41.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting aiohttp==3.8.1\n",
            "  Downloading aiohttp-3.8.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.2 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.2/1.2 MB\u001b[0m \u001b[31m59.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: urllib3<2,>1.24 in /usr/local/lib/python3.9/site-packages (from alpaca_trade_api>=2.1.0->finrl==0.3.5) (1.26.15)\n",
            "Collecting aiosignal>=1.1.2\n",
            "  Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)\n",
            "Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.9/site-packages (from aiohttp==3.8.1->alpaca_trade_api>=2.1.0->finrl==0.3.5) (2.1.1)\n",
            "Collecting multidict<7.0,>=4.5\n",
            "  Downloading multidict-6.0.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m114.2/114.2 kB\u001b[0m \u001b[31m14.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting async-timeout<5.0,>=4.0.0a3\n",
            "  Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)\n",
            "Collecting frozenlist>=1.1.1\n",
            "  Downloading frozenlist-1.3.3-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (158 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m158.8/158.8 kB\u001b[0m \u001b[31m14.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting yarl<2.0,>=1.0\n",
            "  Downloading yarl-1.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (264 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m264.6/264.6 kB\u001b[0m \u001b[31m20.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting attrs>=17.3.0\n",
            "  Downloading attrs-22.2.0-py3-none-any.whl (60 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m60.0/60.0 kB\u001b[0m \u001b[31m6.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting packaging\n",
            "  Downloading packaging-23.0-py3-none-any.whl (42 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m42.7/42.7 kB\u001b[0m \u001b[31m4.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: setuptools>=60.9.0 in /usr/local/lib/python3.9/site-packages (from ccxt>=1.66.32->finrl==0.3.5) (65.6.3)\n",
            "Collecting aiodns>=1.1.1\n",
            "  Downloading aiodns-3.0.0-py3-none-any.whl (5.0 kB)\n",
            "Requirement already satisfied: certifi>=2018.1.18 in /usr/local/lib/python3.9/site-packages (from ccxt>=1.66.32->finrl==0.3.5) (2022.12.7)\n",
            "Requirement already satisfied: cryptography>=2.6.1 in /usr/local/lib/python3.9/site-packages (from ccxt>=1.66.32->finrl==0.3.5) (39.0.2)\n",
            "Collecting cloudpickle>=1.2.0\n",
            "  Downloading cloudpickle-2.2.1-py3-none-any.whl (25 kB)\n",
            "Collecting gym-notices>=0.0.4\n",
            "  Downloading gym_notices-0.0.8-py3-none-any.whl (3.0 kB)\n",
            "Collecting click>=7.0\n",
            "  Downloading click-8.1.3-py3-none-any.whl (96 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m96.6/96.6 kB\u001b[0m \u001b[31m9.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting virtualenv>=20.0.24\n",
            "  Downloading virtualenv-20.21.0-py3-none-any.whl (8.7 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m8.7/8.7 MB\u001b[0m \u001b[31m55.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting protobuf!=3.19.5,>=3.15.3\n",
            "  Downloading protobuf-4.22.1-cp37-abi3-manylinux2014_x86_64.whl (302 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m302.4/302.4 kB\u001b[0m \u001b[31m22.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting grpcio>=1.32.0\n",
            "  Downloading grpcio-1.53.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.0/5.0 MB\u001b[0m \u001b[31m54.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting jsonschema\n",
            "  Downloading jsonschema-4.17.3-py3-none-any.whl (90 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m90.4/90.4 kB\u001b[0m \u001b[31m8.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting filelock\n",
            "  Downloading filelock-3.10.7-py3-none-any.whl (10 kB)\n",
            "Collecting tabulate\n",
            "  Downloading tabulate-0.9.0-py3-none-any.whl (35 kB)\n",
            "Collecting py-spy>=0.2.0\n",
            "  Downloading py_spy-0.3.14-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (3.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.0/3.0 MB\u001b[0m \u001b[31m52.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting prometheus-client>=0.7.1\n",
            "  Downloading prometheus_client-0.16.0-py3-none-any.whl (122 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m122.5/122.5 kB\u001b[0m \u001b[31m12.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting gpustat>=1.0.0\n",
            "  Downloading gpustat-1.0.0.tar.gz (90 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m90.5/90.5 kB\u001b[0m \u001b[31m9.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting aiohttp-cors\n",
            "  Downloading aiohttp_cors-0.7.0-py3-none-any.whl (27 kB)\n",
            "Collecting colorful\n",
            "  Downloading colorful-0.5.5-py2.py3-none-any.whl (201 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m201.4/201.4 kB\u001b[0m \u001b[31m21.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting opencensus\n",
            "  Downloading opencensus-0.11.2-py2.py3-none-any.whl (128 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m128.2/128.2 kB\u001b[0m \u001b[31m15.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting smart-open\n",
            "  Downloading smart_open-6.3.0-py3-none-any.whl (56 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m56.8/56.8 kB\u001b[0m \u001b[31m6.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pydantic\n",
            "  Downloading pydantic-1.10.7-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.2/3.2 MB\u001b[0m \u001b[31m62.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting threadpoolctl>=2.0.0\n",
            "  Downloading threadpoolctl-3.1.0-py3-none-any.whl (14 kB)\n",
            "Collecting joblib>=1.1.1\n",
            "  Downloading joblib-1.2.0-py3-none-any.whl (297 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m298.0/298.0 kB\u001b[0m \u001b[31m28.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting scipy>=1.3.2\n",
            "  Downloading scipy-1.10.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.5 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m34.5/34.5 MB\u001b[0m \u001b[31m21.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting gym>=0.17\n",
            "  Downloading gym-0.21.0.tar.gz (1.5 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.5/1.5 MB\u001b[0m \u001b[31m63.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting torch>=1.11\n",
            "  Downloading torch-2.0.0-cp39-cp39-manylinux1_x86_64.whl (619.9 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m619.9/619.9 MB\u001b[0m \u001b[31m1.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting protobuf!=3.19.5,>=3.15.3\n",
            "  Downloading protobuf-3.20.3-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.0/1.0 MB\u001b[0m \u001b[31m52.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting psycopg2-binary\n",
            "  Using cached psycopg2_binary-2.9.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)\n",
            "Collecting sqlalchemy<2\n",
            "  Downloading SQLAlchemy-1.4.47-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m70.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting six\n",
            "  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)\n",
            "Collecting thriftpy2>=0.3.9\n",
            "  Downloading thriftpy2-0.4.16.tar.gz (643 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m643.4/643.4 kB\u001b[0m \u001b[31m39.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting pymysql>=0.7.6\n",
            "  Downloading PyMySQL-1.0.3-py3-none-any.whl (43 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m43.7/43.7 kB\u001b[0m \u001b[31m4.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting kiwisolver>=1.0.1\n",
            "  Downloading kiwisolver-1.4.4-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m67.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting fonttools>=4.22.0\n",
            "  Downloading fonttools-4.39.3-py3-none-any.whl (1.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.0/1.0 MB\u001b[0m \u001b[31m55.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pillow>=6.2.0\n",
            "  Downloading Pillow-9.4.0-cp39-cp39-manylinux_2_28_x86_64.whl (3.4 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.4/3.4 MB\u001b[0m \u001b[31m79.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pyparsing>=2.3.1\n",
            "  Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m98.3/98.3 kB\u001b[0m \u001b[31m10.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting importlib-resources>=3.2.0\n",
            "  Downloading importlib_resources-5.12.0-py3-none-any.whl (36 kB)\n",
            "Collecting contourpy>=1.0.1\n",
            "  Downloading contourpy-1.0.7-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (299 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m299.7/299.7 kB\u001b[0m \u001b[31m25.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting cycler>=0.10\n",
            "  Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)\n",
            "Collecting ipython>=3.2.3\n",
            "  Downloading ipython-8.12.0-py3-none-any.whl (796 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m796.4/796.4 kB\u001b[0m \u001b[31m49.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting seaborn>=0.7.1\n",
            "  Downloading seaborn-0.12.2-py3-none-any.whl (293 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m293.3/293.3 kB\u001b[0m \u001b[31m25.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting empyrical>=0.5.0\n",
            "  Downloading empyrical-0.5.5.tar.gz (52 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m52.8/52.8 kB\u001b[0m \u001b[31m6.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting appdirs>=1.4.4\n",
            "  Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)\n",
            "Collecting beautifulsoup4>=4.11.1\n",
            "  Downloading beautifulsoup4-4.12.0-py3-none-any.whl (132 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m132.2/132.2 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting html5lib>=1.1\n",
            "  Downloading html5lib-1.1-py2.py3-none-any.whl (112 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m112.2/112.2 kB\u001b[0m \u001b[31m14.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting frozendict>=2.3.4\n",
            "  Downloading frozendict-2.3.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m114.3/114.3 kB\u001b[0m \u001b[31m14.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting multitasking>=0.0.7\n",
            "  Downloading multitasking-0.0.11-py3-none-any.whl (8.5 kB)\n",
            "Collecting lxml>=4.9.1\n",
            "  Downloading lxml-4.9.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (7.1 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.1/7.1 MB\u001b[0m \u001b[31m83.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pycares>=4.0.0\n",
            "  Downloading pycares-4.3.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m288.7/288.7 kB\u001b[0m \u001b[31m22.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting soupsieve>1.2\n",
            "  Downloading soupsieve-2.4-py3-none-any.whl (37 kB)\n",
            "Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.9/site-packages (from cryptography>=2.6.1->ccxt>=1.66.32->finrl==0.3.5) (1.15.1)\n",
            "Collecting pandas-datareader>=0.2\n",
            "  Downloading pandas_datareader-0.10.0-py3-none-any.whl (109 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m109.5/109.5 kB\u001b[0m \u001b[31m11.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-ml-py<=11.495.46,>=11.450.129\n",
            "  Downloading nvidia_ml_py-11.495.46-py3-none-any.whl (25 kB)\n",
            "Collecting psutil>=5.6.0\n",
            "  Downloading psutil-5.9.4-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (280 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m280.2/280.2 kB\u001b[0m \u001b[31m23.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting blessed>=1.17.1\n",
            "  Downloading blessed-1.20.0-py2.py3-none-any.whl (58 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.4/58.4 kB\u001b[0m \u001b[31m4.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting webencodings\n",
            "  Downloading webencodings-0.5.1-py2.py3-none-any.whl (11 kB)\n",
            "Collecting prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30\n",
            "  Downloading prompt_toolkit-3.0.38-py3-none-any.whl (385 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m385.8/385.8 kB\u001b[0m \u001b[31m27.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting traitlets>=5\n",
            "  Downloading traitlets-5.9.0-py3-none-any.whl (117 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m117.4/117.4 kB\u001b[0m \u001b[31m9.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting typing-extensions\n",
            "  Downloading typing_extensions-4.5.0-py3-none-any.whl (27 kB)\n",
            "Collecting pygments>=2.4.0\n",
            "  Downloading Pygments-2.14.0-py3-none-any.whl (1.1 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.1/1.1 MB\u001b[0m \u001b[31m51.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting jedi>=0.16\n",
            "  Downloading jedi-0.18.2-py2.py3-none-any.whl (1.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m59.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting stack-data\n",
            "  Downloading stack_data-0.6.2-py3-none-any.whl (24 kB)\n",
            "Collecting decorator\n",
            "  Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)\n",
            "Collecting matplotlib-inline\n",
            "  Downloading matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB)\n",
            "Collecting pexpect>4.3\n",
            "  Downloading pexpect-4.8.0-py2.py3-none-any.whl (59 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m59.0/59.0 kB\u001b[0m \u001b[31m6.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting backcall\n",
            "  Downloading backcall-0.2.0-py2.py3-none-any.whl (11 kB)\n",
            "Collecting pickleshare\n",
            "  Downloading pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)\n",
            "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.9/site-packages (from requests<3,>2->alpaca_trade_api>=2.1.0->finrl==0.3.5) (3.4)\n",
            "Collecting greenlet!=0.4.17\n",
            "  Downloading greenlet-2.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (610 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m610.9/610.9 kB\u001b[0m \u001b[31m32.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting ply<4.0,>=3.4\n",
            "  Downloading ply-3.11-py2.py3-none-any.whl (49 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.6/49.6 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cublas-cu11==11.10.3.66\n",
            "  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m317.1/317.1 MB\u001b[0m \u001b[31m4.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cusparse-cu11==11.7.4.91\n",
            "  Downloading nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m173.2/173.2 MB\u001b[0m \u001b[31m4.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cusolver-cu11==11.4.0.1\n",
            "  Downloading nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m102.6/102.6 MB\u001b[0m \u001b[31m7.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting triton==2.0.0\n",
            "  Downloading triton-2.0.0-1-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m63.3/63.3 MB\u001b[0m \u001b[31m9.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cuda-runtime-cu11==11.7.99\n",
            "  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m849.3/849.3 kB\u001b[0m \u001b[31m28.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cufft-cu11==10.9.0.58\n",
            "  Downloading nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m168.4/168.4 MB\u001b[0m \u001b[31m5.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cudnn-cu11==8.5.0.96\n",
            "  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m557.1/557.1 MB\u001b[0m \u001b[31m2.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting jinja2\n",
            "  Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m133.1/133.1 kB\u001b[0m \u001b[31m14.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting networkx\n",
            "  Downloading networkx-3.0-py3-none-any.whl (2.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m73.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting sympy\n",
            "  Downloading sympy-1.11.1-py3-none-any.whl (6.5 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.5/6.5 MB\u001b[0m \u001b[31m83.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cuda-nvrtc-cu11==11.7.99\n",
            "  Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m21.0/21.0 MB\u001b[0m \u001b[31m55.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-cuda-cupti-cu11==11.7.101\n",
            "  Downloading nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m11.8/11.8 MB\u001b[0m \u001b[31m54.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-curand-cu11==10.2.10.91\n",
            "  Downloading nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m54.6/54.6 MB\u001b[0m \u001b[31m9.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-nvtx-cu11==11.7.91\n",
            "  Downloading nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m98.6/98.6 kB\u001b[0m \u001b[31m10.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting nvidia-nccl-cu11==2.14.3\n",
            "  Downloading nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m177.1/177.1 MB\u001b[0m \u001b[31m5.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: wheel in /usr/local/lib/python3.9/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.11->stable-baselines3<2.0.0,>=1.6.2->finrl==0.3.5) (0.38.4)\n",
            "Collecting cmake\n",
            "  Downloading cmake-3.26.1-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (24.0 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m24.0/24.0 MB\u001b[0m \u001b[31m34.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting lit\n",
            "  Downloading lit-16.0.0.tar.gz (144 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m145.0/145.0 kB\u001b[0m \u001b[31m14.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting platformdirs<4,>=2.4\n",
            "  Downloading platformdirs-3.2.0-py3-none-any.whl (14 kB)\n",
            "Collecting distlib<1,>=0.3.6\n",
            "  Downloading distlib-0.3.6-py2.py3-none-any.whl (468 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m468.5/468.5 kB\u001b[0m \u001b[31m34.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting gym[box2d]\n",
            "  Downloading gym-0.26.1.tar.gz (719 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m719.9/719.9 kB\u001b[0m \u001b[31m40.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.26.0.tar.gz (710 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m710.3/710.3 kB\u001b[0m \u001b[31m45.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.25.2.tar.gz (734 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m734.5/734.5 kB\u001b[0m \u001b[31m47.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.25.1.tar.gz (732 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m732.2/732.2 kB\u001b[0m \u001b[31m37.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.25.0.tar.gz (720 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m720.4/720.4 kB\u001b[0m \u001b[31m35.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.24.1.tar.gz (696 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m696.4/696.4 kB\u001b[0m \u001b[31m38.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.24.0.tar.gz (694 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m694.4/694.4 kB\u001b[0m \u001b[31m46.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.23.1.tar.gz (626 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m626.2/626.2 kB\u001b[0m \u001b[31m45.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.23.0.tar.gz (624 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m624.4/624.4 kB\u001b[0m \u001b[31m30.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Downloading gym-0.22.0.tar.gz (631 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m631.1/631.1 kB\u001b[0m \u001b[31m30.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
            "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting box2d-py==2.3.5\n",
            "  Downloading box2d-py-2.3.5.tar.gz (374 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m374.4/374.4 kB\u001b[0m \u001b[31m27.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting pyglet>=1.4.0\n",
            "  Downloading pyglet-2.0.5-py3-none-any.whl (831 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m831.3/831.3 kB\u001b[0m \u001b[31m52.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0\n",
            "  Downloading pyrsistent-0.19.3-py3-none-any.whl (57 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m57.5/57.5 kB\u001b[0m \u001b[31m7.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting google-api-core<3.0.0,>=1.0.0\n",
            "  Downloading google_api_core-2.11.0-py3-none-any.whl (120 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m120.3/120.3 kB\u001b[0m \u001b[31m14.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting opencensus-context>=0.1.3\n",
            "  Downloading opencensus_context-0.1.3-py2.py3-none-any.whl (5.1 kB)\n",
            "Collecting wcwidth>=0.1.4\n",
            "  Downloading wcwidth-0.2.6-py2.py3-none-any.whl (29 kB)\n",
            "Requirement already satisfied: pycparser in /usr/local/lib/python3.9/site-packages (from cffi>=1.12->cryptography>=2.6.1->ccxt>=1.66.32->finrl==0.3.5) (2.21)\n",
            "Collecting google-auth<3.0dev,>=2.14.1\n",
            "  Downloading google_auth-2.17.1-py2.py3-none-any.whl (178 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m178.1/178.1 kB\u001b[0m \u001b[31m19.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting googleapis-common-protos<2.0dev,>=1.56.2\n",
            "  Downloading googleapis_common_protos-1.59.0-py2.py3-none-any.whl (223 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m223.6/223.6 kB\u001b[0m \u001b[31m24.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting parso<0.9.0,>=0.8.0\n",
            "  Downloading parso-0.8.3-py2.py3-none-any.whl (100 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m100.8/100.8 kB\u001b[0m \u001b[31m11.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting ptyprocess>=0.5\n",
            "  Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)\n",
            "Collecting MarkupSafe>=2.0\n",
            "  Downloading MarkupSafe-2.1.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)\n",
            "Collecting pure-eval\n",
            "  Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)\n",
            "Collecting executing>=1.2.0\n",
            "  Downloading executing-1.2.0-py2.py3-none-any.whl (24 kB)\n",
            "Collecting asttokens>=2.1.0\n",
            "  Downloading asttokens-2.2.1-py2.py3-none-any.whl (26 kB)\n",
            "Collecting mpmath>=0.19\n",
            "  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m536.2/536.2 kB\u001b[0m \u001b[31m43.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting rsa<5,>=3.1.4\n",
            "  Downloading rsa-4.9-py3-none-any.whl (34 kB)\n",
            "Collecting pyasn1-modules>=0.2.1\n",
            "  Downloading pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m155.3/155.3 kB\u001b[0m \u001b[31m18.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting cachetools<6.0,>=2.0.0\n",
            "  Downloading cachetools-5.3.0-py3-none-any.whl (9.3 kB)\n",
            "Collecting pyasn1<0.5.0,>=0.4.6\n",
            "  Downloading pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.1/77.1 kB\u001b[0m \u001b[31m8.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hBuilding wheels for collected packages: finrl, exchange_calendars, gym, elegantrl, gputil, pyfolio, empyrical, gpustat, thriftpy2, box2d-py, lit\n",
            "  Building wheel for finrl (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for finrl: filename=finrl-0.3.5-py3-none-any.whl size=4668526 sha256=f55b41d967223f3a27e4404f4d382736d5c5cbc2e1cfd7736de48586998df184\n",
            "  Stored in directory: /tmp/pip-ephem-wheel-cache-fopkegt4/wheels/ec/6a/08/c43694890a7c5a62c23af4b2a497bce5ee7edef607852cf53f\n",
            "  Building wheel for exchange_calendars (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for exchange_calendars: filename=exchange_calendars-3.6.3-py3-none-any.whl size=182636 sha256=dfc7624a26c2bb0463c6ce659d9f6d0b32cfc6cd49c3ef7d0c9ddcd09b49398e\n",
            "  Stored in directory: /root/.cache/pip/wheels/4e/02/f9/6c6eeb48a242879e357caf2813953fa8b6e26bd0110bd94226\n",
            "  Building wheel for gym (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for gym: filename=gym-0.21.0-py3-none-any.whl size=1616819 sha256=faa02f535febfd071894e5017afb8902151ba8973f93e252400e27bd560366a9\n",
            "  Stored in directory: /root/.cache/pip/wheels/b3/50/6c/0a82c1358b4da2dbd9c1bb17e0f89467db32812ab236dbf6d5\n",
            "  Building wheel for elegantrl (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for elegantrl: filename=elegantrl-0.3.6-py3-none-any.whl size=195053 sha256=2a848a648928884e7570c74b8f16d0b19603669d5bebb6b3b99cb281e10c68d4\n",
            "  Stored in directory: /tmp/pip-ephem-wheel-cache-fopkegt4/wheels/a3/c3/be/03eb1f20c8650f23ab13b823d93a297a917899f5d08b04b7b9\n",
            "  Building wheel for gputil (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for gputil: filename=GPUtil-1.4.0-py3-none-any.whl size=7409 sha256=c2ab06060122ac074b17acd6bb902ed0348de375b21d7374d13cb0a73b1cb835\n",
            "  Stored in directory: /root/.cache/pip/wheels/2b/b5/24/fbb56595c286984f7315ee31821d6121e1b9828436021a88b3\n",
            "  Building wheel for pyfolio (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for pyfolio: filename=pyfolio-0.9.2+75.g4b901f6-py3-none-any.whl size=75773 sha256=481aca931367143b883ff1cb591a7eae1cddbc83674ba99b77d6ce45458c74c3\n",
            "  Stored in directory: /tmp/pip-ephem-wheel-cache-fopkegt4/wheels/da/0d/dd/aef7001cc1238aff04ec9eabfc002341f00c50deead3083855\n",
            "  Building wheel for empyrical (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for empyrical: filename=empyrical-0.5.5-py3-none-any.whl size=39776 sha256=058a2495ba35828b597997dc4fbaaf2766894aea421a9a0dabfda0298293af95\n",
            "  Stored in directory: /root/.cache/pip/wheels/67/23/d1/a4ef8ff88dc9af7b0eeb1b6fd0d90c6057eaad5a2df25f4e3f\n",
            "  Building wheel for gpustat (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for gpustat: filename=gpustat-1.0.0-py3-none-any.whl size=19884 sha256=58bed7ad89eddee8af26ea29c48037dc74474fdcbdf76fab1dc9fa0a6e1d5f51\n",
            "  Stored in directory: /root/.cache/pip/wheels/ce/13/aa/145d9d670feb2cf4a0691b9a3552aafc8a1b49c5162a0f379d\n",
            "  Building wheel for thriftpy2 (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for thriftpy2: filename=thriftpy2-0.4.16-cp39-cp39-linux_x86_64.whl size=529342 sha256=97b595d2db82c5223de2f982f830baeffdd4e97880ff9a00476539b23f777b33\n",
            "  Stored in directory: /root/.cache/pip/wheels/88/a4/d5/907737b4c175aec82087b815fa93a8afea5c6c5a3e7bb748b9\n",
            "  Building wheel for box2d-py (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for box2d-py: filename=box2d_py-2.3.5-cp39-cp39-linux_x86_64.whl size=494643 sha256=714bbf157f1f852d527e4c97b5701cd21ce2c606cdb9d6a1281eeb83f722d8c2\n",
            "  Stored in directory: /root/.cache/pip/wheels/a4/c2/c1/076651c394f05fe60990cd85616c2d95bc1619aa113f559d7d\n",
            "  Building wheel for lit (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for lit: filename=lit-16.0.0-py3-none-any.whl size=93598 sha256=4af1790b3bda595a7e575e84d292c54d3d2ab3e723ca3d49e5dc43981b53e392\n",
            "  Stored in directory: /root/.cache/pip/wheels/c7/ee/80/1520ca86c3557f70e5504b802072f7fc3b0e2147f376b133ed\n",
            "Successfully built finrl exchange_calendars gym elegantrl gputil pyfolio empyrical gpustat thriftpy2 box2d-py lit\n",
            "Installing collected packages: webencodings, wcwidth, pytz, pyglet, pyasn1, py-spy, pure-eval, ptyprocess, ply, pickleshare, opencensus-context, nvidia-ml-py, multitasking, msgpack, mpmath, lit, korean_lunar_calendar, gputil, executing, distlib, colorful, cmake, box2d-py, backcall, appdirs, zipp, websockets, websocket-client, typing-extensions, traitlets, threadpoolctl, tabulate, sympy, soupsieve, smart-open, six, rsa, PyYAML, pyrsistent, pyparsing, pymysql, pyluach, pygments, pyasn1-modules, psycopg2-binary, psutil, protobuf, prompt-toolkit, prometheus-client, platformdirs, pillow, pexpect, parso, packaging, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, numpy, networkx, multidict, MarkupSafe, lz4, lxml, kiwisolver, joblib, grpcio, greenlet, frozenlist, frozendict, fonttools, filelock, decorator, cycler, cloudpickle, click, cachetools, attrs, async-timeout, yarl, virtualenv, thriftpy2, tensorboardX, sqlalchemy, scipy, python-dateutil, pydantic, pycares, nvidia-cusolver-cu11, nvidia-cudnn-cu11, matplotlib-inline, jsonschema, jinja2, jedi, importlib-resources, importlib-metadata, html5lib, gym, googleapis-common-protos, google-auth, deprecation, contourpy, blessed, beautifulsoup4, asttokens, aiosignal, stack-data, scikit-learn, ray, pandas, matplotlib, gpustat, google-api-core, aiohttp, aiodns, yfinance, wrds, stockstats, seaborn, pandas-datareader, opencensus, jqdatasdk, ipython, exchange_calendars, ccxt, alpaca_trade_api, aiohttp-cors, empyrical, pyfolio, triton, torch, stable-baselines3, elegantrl, finrl\n",
            "Successfully installed MarkupSafe-2.1.2 PyYAML-6.0 aiodns-3.0.0 aiohttp-3.8.1 aiohttp-cors-0.7.0 aiosignal-1.3.1 alpaca_trade_api-3.0.0 appdirs-1.4.4 asttokens-2.2.1 async-timeout-4.0.2 attrs-22.2.0 backcall-0.2.0 beautifulsoup4-4.12.0 blessed-1.20.0 box2d-py-2.3.5 cachetools-5.3.0 ccxt-3.0.46 click-8.1.3 cloudpickle-2.2.1 cmake-3.26.1 colorful-0.5.5 contourpy-1.0.7 cycler-0.11.0 decorator-5.1.1 deprecation-2.1.0 distlib-0.3.6 elegantrl-0.3.6 empyrical-0.5.5 exchange_calendars-3.6.3 executing-1.2.0 filelock-3.10.7 finrl-0.3.5 fonttools-4.39.3 frozendict-2.3.6 frozenlist-1.3.3 google-api-core-2.11.0 google-auth-2.17.1 googleapis-common-protos-1.59.0 gpustat-1.0.0 gputil-1.4.0 greenlet-2.0.2 grpcio-1.53.0 gym-0.21.0 html5lib-1.1 importlib-metadata-4.13.0 importlib-resources-5.12.0 ipython-8.12.0 jedi-0.18.2 jinja2-3.1.2 joblib-1.2.0 jqdatasdk-1.8.11 jsonschema-4.17.3 kiwisolver-1.4.4 korean_lunar_calendar-0.3.1 lit-16.0.0 lxml-4.9.2 lz4-4.3.2 matplotlib-3.7.1 matplotlib-inline-0.1.6 mpmath-1.3.0 msgpack-1.0.3 multidict-6.0.4 multitasking-0.0.11 networkx-3.0 numpy-1.24.2 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-ml-py-11.495.46 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 opencensus-0.11.2 opencensus-context-0.1.3 packaging-23.0 pandas-1.5.3 pandas-datareader-0.10.0 parso-0.8.3 pexpect-4.8.0 pickleshare-0.7.5 pillow-9.4.0 platformdirs-3.2.0 ply-3.11 prometheus-client-0.16.0 prompt-toolkit-3.0.38 protobuf-3.20.3 psutil-5.9.4 psycopg2-binary-2.9.5 ptyprocess-0.7.0 pure-eval-0.2.2 py-spy-0.3.14 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycares-4.3.0 pydantic-1.10.7 pyfolio-0.9.2+75.g4b901f6 pyglet-2.0.5 pygments-2.14.0 pyluach-2.2.0 pymysql-1.0.3 pyparsing-3.0.9 pyrsistent-0.19.3 python-dateutil-2.8.2 pytz-2023.3 ray-2.3.1 
rsa-4.9 scikit-learn-1.2.2 scipy-1.10.1 seaborn-0.12.2 six-1.16.0 smart-open-6.3.0 soupsieve-2.4 sqlalchemy-1.4.47 stable-baselines3-1.7.0 stack-data-0.6.2 stockstats-0.5.2 sympy-1.11.1 tabulate-0.9.0 tensorboardX-2.6 threadpoolctl-3.1.0 thriftpy2-0.4.16 torch-2.0.0 traitlets-5.9.0 triton-2.0.0 typing-extensions-4.5.0 virtualenv-20.21.0 wcwidth-0.2.6 webencodings-0.5.1 websocket-client-1.5.1 websockets-10.4 wrds-3.1.6 yarl-1.8.2 yfinance-0.2.14 zipp-3.15.0\n",
            "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
            "\u001b[0m"
          ]
        },
        {
          "output_type": "display_data",
          "data": {
            "application/vnd.colab-display-data+json": {
              "pip_warning": {
                "packages": [
                  "cycler",
                  "dateutil",
                  "google",
                  "importlib_resources",
                  "kiwisolver",
                  "matplotlib",
                  "matplotlib_inline",
                  "mpl_toolkits",
                  "pexpect",
                  "pickleshare",
                  "prompt_toolkit",
                  "psutil",
                  "pygments",
                  "six",
                  "wcwidth",
                  "zipp"
                ]
              }
            }
          },
          "metadata": {}
        }
      ],
      "source": [
        "## install required packages\n",
        "!pip install swig\n",
        "!pip install wrds\n",
        "!pip install pyportfolioopt\n",
        "## install finrl library\n",
        "!pip install -q condacolab\n",
        "import condacolab\n",
        "condacolab.install()\n",
        "!apt-get update -y -qq && apt-get install -y -qq cmake libopenmpi-dev python3-dev zlib1g-dev libgl1-mesa-glx swig\n",
        "!pip install git+https://github.com/AI4Finance-Foundation/FinRL.git"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nGv01K8Sh1hn"
      },
      "source": [
        "<a id='1.3'></a>\n",
        "## 1.2. Import Packages"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "lPqeTTwoh1hn",
        "outputId": "7918ded5-5571-4aa0-c335-e5ff1ba5a94e"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "/usr/local/lib/python3.9/site-packages/pyfolio/pos.py:26: UserWarning: Module \"zipline.assets\" not found; multipliers will not be applied to position notionals.\n",
            "  warnings.warn(\n"
          ]
        }
      ],
      "source": [
        "\n",
        "\n",
        "from finrl import config\n",
        "from finrl import config_tickers\n",
        "from finrl.agents.stablebaselines3.models import DRLAgent\n",
        "from finrl.config import DATA_SAVE_DIR\n",
        "from finrl.config import INDICATORS\n",
        "from finrl.config import RESULTS_DIR\n",
        "from finrl.config import TENSORBOARD_LOG_DIR\n",
        "from finrl.config import TEST_END_DATE\n",
        "from finrl.config import TEST_START_DATE\n",
        "from finrl.config import TRAINED_MODEL_DIR\n",
        "from finrl.config_tickers import DOW_30_TICKER\n",
        "from finrl.main import check_and_make_directories\n",
        "from finrl.meta.data_processor import DataProcessor\n",
        "from finrl.meta.data_processors.func import calc_train_trade_data\n",
        "from finrl.meta.data_processors.func import calc_train_trade_starts_ends_if_rolling\n",
        "from finrl.meta.data_processors.func import date2str\n",
        "from finrl.meta.data_processors.func import str2date\n",
        "from finrl.meta.env_stock_trading.env_stocktrading import StockTradingEnv\n",
        "from finrl.meta.preprocessor.preprocessors import data_split\n",
        "from finrl.meta.preprocessor.preprocessors import FeatureEngineer\n",
        "from finrl.meta.preprocessor.yahoodownloader import YahooDownloader\n",
        "from finrl.plot import backtest_plot\n",
        "from finrl.plot import backtest_stats\n",
        "from finrl.plot import get_baseline\n",
        "from finrl.plot import get_daily_return\n",
        "from finrl.plot import plot_return\n",
        "from finrl.applications.stock_trading.stock_trading_rolling_window import stock_trading_rolling_window\n",
        "import sys\n",
        "sys.path.append(\"../FinRL\")\n",
        "\n",
        "import itertools"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "T2owTj985RW4"
      },
      "source": [
        "<a id='1.4'></a>\n",
        "# 2 Set parameters and run\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "RtUc_ofKmpdy",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "203fec48-d3fa-48fe-ec40-eda9a2799c48"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\u001b[1;30;43m流式输出内容被截断，只能显示最后 5000 行内容。\u001b[0m\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 52.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 10          |\n",
            "|    time_elapsed         | 316         |\n",
            "|    total_timesteps      | 20480       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.018726377 |\n",
            "|    clip_fraction        | 0.228       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.6       |\n",
            "|    explained_variance   | -0.00599    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 10          |\n",
            "|    n_updates            | 90          |\n",
            "|    policy_gradient_loss | -0.0229     |\n",
            "|    reward               | 1.8604985   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 34          |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 11          |\n",
            "|    time_elapsed         | 350         |\n",
            "|    total_timesteps      | 22528       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017771121 |\n",
            "|    clip_fraction        | 0.201       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | -0.00452    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 102         |\n",
            "|    n_updates            | 100         |\n",
            "|    policy_gradient_loss | -0.0176     |\n",
            "|    reward               | 2.4363315   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 257         |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 12          |\n",
            "|    time_elapsed         | 380         |\n",
            "|    total_timesteps      | 24576       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.021592125 |\n",
            "|    clip_fraction        | 0.24        |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | -0.00462    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 13.1        |\n",
            "|    n_updates            | 110         |\n",
            "|    policy_gradient_loss | -0.0218     |\n",
            "|    reward               | -0.36686477 |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 27.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 13          |\n",
            "|    time_elapsed         | 417         |\n",
            "|    total_timesteps      | 26624       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.016095877 |\n",
            "|    clip_fraction        | 0.171       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.8       |\n",
            "|    explained_variance   | 0.00607     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 63.8        |\n",
            "|    n_updates            | 120         |\n",
            "|    policy_gradient_loss | -0.0175     |\n",
            "|    reward               | -6.2590113  |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 161         |\n",
            "-----------------------------------------\n",
            "----------------------------------------\n",
            "| time/                   |            |\n",
            "|    fps                  | 64         |\n",
            "|    iterations           | 14         |\n",
            "|    time_elapsed         | 447        |\n",
            "|    total_timesteps      | 28672      |\n",
            "| train/                  |            |\n",
            "|    approx_kl            | 0.02099569 |\n",
            "|    clip_fraction        | 0.204      |\n",
            "|    clip_range           | 0.2        |\n",
            "|    entropy_loss         | -41.9      |\n",
            "|    explained_variance   | 0.00587    |\n",
            "|    learning_rate        | 0.00025    |\n",
            "|    loss                 | 18.1       |\n",
            "|    n_updates            | 130        |\n",
            "|    policy_gradient_loss | -0.0176    |\n",
            "|    reward               | -1.5635415 |\n",
            "|    std                  | 1.03       |\n",
            "|    value_loss           | 76.1       |\n",
            "----------------------------------------\n",
            "day: 3374, episode: 10\n",
            "begin_total_asset: 1017321.61\n",
            "end_total_asset: 4690150.25\n",
            "total_reward: 3672828.63\n",
            "total_cost: 440655.06\n",
            "total_trades: 91574\n",
            "Sharpe: 0.777\n",
            "=================================\n",
            "------------------------------------------\n",
            "| time/                   |              |\n",
            "|    fps                  | 63           |\n",
            "|    iterations           | 15           |\n",
            "|    time_elapsed         | 480          |\n",
            "|    total_timesteps      | 30720        |\n",
            "| train/                  |              |\n",
            "|    approx_kl            | 0.01574407   |\n",
            "|    clip_fraction        | 0.252        |\n",
            "|    clip_range           | 0.2          |\n",
            "|    entropy_loss         | -41.9        |\n",
            "|    explained_variance   | 0.045        |\n",
            "|    learning_rate        | 0.00025      |\n",
            "|    loss                 | 8.21         |\n",
            "|    n_updates            | 140          |\n",
            "|    policy_gradient_loss | -0.0207      |\n",
            "|    reward               | -0.058135245 |\n",
            "|    std                  | 1.03         |\n",
            "|    value_loss           | 20           |\n",
            "------------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 16          |\n",
            "|    time_elapsed         | 511         |\n",
            "|    total_timesteps      | 32768       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.018864237 |\n",
            "|    clip_fraction        | 0.19        |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | -0.0334     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 40.5        |\n",
            "|    n_updates            | 150         |\n",
            "|    policy_gradient_loss | -0.0158     |\n",
            "|    reward               | 2.1892703   |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 80.4        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 17          |\n",
            "|    time_elapsed         | 542         |\n",
            "|    total_timesteps      | 34816       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.025924759 |\n",
            "|    clip_fraction        | 0.183       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | -0.0494     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 8.64        |\n",
            "|    n_updates            | 160         |\n",
            "|    policy_gradient_loss | -0.0154     |\n",
            "|    reward               | -1.6194284  |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 19.1        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 18          |\n",
            "|    time_elapsed         | 576         |\n",
            "|    total_timesteps      | 36864       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.023486339 |\n",
            "|    clip_fraction        | 0.227       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | -0.00164    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 71          |\n",
            "|    n_updates            | 170         |\n",
            "|    policy_gradient_loss | -0.0128     |\n",
            "|    reward               | -6.5787015  |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 175         |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 19          |\n",
            "|    time_elapsed         | 609         |\n",
            "|    total_timesteps      | 38912       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.047546946 |\n",
            "|    clip_fraction        | 0.278       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | 0.0083      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 22.2        |\n",
            "|    n_updates            | 180         |\n",
            "|    policy_gradient_loss | -0.00743    |\n",
            "|    reward               | 3.6853487   |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 88.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 20          |\n",
            "|    time_elapsed         | 643         |\n",
            "|    total_timesteps      | 40960       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.028585846 |\n",
            "|    clip_fraction        | 0.238       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.1       |\n",
            "|    explained_variance   | -0.018      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 12.6        |\n",
            "|    n_updates            | 190         |\n",
            "|    policy_gradient_loss | -0.0166     |\n",
            "|    reward               | 2.84366     |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 35.7        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 21          |\n",
            "|    time_elapsed         | 672         |\n",
            "|    total_timesteps      | 43008       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.021615773 |\n",
            "|    clip_fraction        | 0.283       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.1       |\n",
            "|    explained_variance   | 0.0164      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 39.1        |\n",
            "|    n_updates            | 200         |\n",
            "|    policy_gradient_loss | -0.0119     |\n",
            "|    reward               | 7.260352    |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 85.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 22          |\n",
            "|    time_elapsed         | 703         |\n",
            "|    total_timesteps      | 45056       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.023984132 |\n",
            "|    clip_fraction        | 0.174       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.2       |\n",
            "|    explained_variance   | -0.0214     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 10.8        |\n",
            "|    n_updates            | 210         |\n",
            "|    policy_gradient_loss | -0.015      |\n",
            "|    reward               | 0.7453349   |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 27.4        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 23          |\n",
            "|    time_elapsed         | 736         |\n",
            "|    total_timesteps      | 47104       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.026311198 |\n",
            "|    clip_fraction        | 0.239       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.2       |\n",
            "|    explained_variance   | 0.0117      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 53.5        |\n",
            "|    n_updates            | 220         |\n",
            "|    policy_gradient_loss | -0.0147     |\n",
            "|    reward               | -3.601917   |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 109         |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 24          |\n",
            "|    time_elapsed         | 765         |\n",
            "|    total_timesteps      | 49152       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.021329464 |\n",
            "|    clip_fraction        | 0.228       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.2       |\n",
            "|    explained_variance   | 0.0287      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 35.5        |\n",
            "|    n_updates            | 230         |\n",
            "|    policy_gradient_loss | -0.0174     |\n",
            "|    reward               | -1.4932549  |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 69.7        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 25          |\n",
            "|    time_elapsed         | 799         |\n",
            "|    total_timesteps      | 51200       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.033834375 |\n",
            "|    clip_fraction        | 0.347       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.2       |\n",
            "|    explained_variance   | -0.0439     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 11.2        |\n",
            "|    n_updates            | 240         |\n",
            "|    policy_gradient_loss | -0.0175     |\n",
            "|    reward               | -0.13293022 |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 31.1        |\n",
            "-----------------------------------------\n",
            "{'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}\n",
            "Using cpu device\n",
            "Logging to results/sac\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 19         |\n",
            "|    time_elapsed    | 693        |\n",
            "|    total_timesteps | 13500      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 1.23e+03   |\n",
            "|    critic_loss     | 941        |\n",
            "|    ent_coef        | 0.175      |\n",
            "|    ent_coef_loss   | -80.9      |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 13399      |\n",
            "|    reward          | -4.1185117 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 19         |\n",
            "|    time_elapsed    | 1407       |\n",
            "|    total_timesteps | 27000      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 486        |\n",
            "|    critic_loss     | 378        |\n",
            "|    ent_coef        | 0.047      |\n",
            "|    ent_coef_loss   | -97.4      |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 26899      |\n",
            "|    reward          | -6.0287046 |\n",
            "-----------------------------------\n",
            "day: 3374, episode: 10\n",
            "begin_total_asset: 1039580.61\n",
            "end_total_asset: 4449383.64\n",
            "total_reward: 3409803.03\n",
            "total_cost: 3171.06\n",
            "total_trades: 48397\n",
            "Sharpe: 0.687\n",
            "=================================\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 12         |\n",
            "|    fps             | 19         |\n",
            "|    time_elapsed    | 2123       |\n",
            "|    total_timesteps | 40500      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 201        |\n",
            "|    critic_loss     | 10.8       |\n",
            "|    ent_coef        | 0.0131     |\n",
            "|    ent_coef_loss   | -63.2      |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 40399      |\n",
            "|    reward          | -5.6925883 |\n",
            "-----------------------------------\n",
            "{'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001}\n",
            "Using cpu device\n",
            "Logging to results/td3\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 24         |\n",
            "|    time_elapsed    | 545        |\n",
            "|    total_timesteps | 13500      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 16.7       |\n",
            "|    critic_loss     | 341        |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 10125      |\n",
            "|    reward          | -5.7216434 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 21         |\n",
            "|    time_elapsed    | 1228       |\n",
            "|    total_timesteps | 27000      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 18.7       |\n",
            "|    critic_loss     | 21.4       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 23625      |\n",
            "|    reward          | -5.7216434 |\n",
            "-----------------------------------\n",
            "day: 3374, episode: 10\n",
            "begin_total_asset: 1043903.24\n",
            "end_total_asset: 5291054.90\n",
            "total_reward: 4247151.66\n",
            "total_cost: 1042.86\n",
            "total_trades: 64106\n",
            "Sharpe: 0.723\n",
            "=================================\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 12         |\n",
            "|    fps             | 21         |\n",
            "|    time_elapsed    | 1923       |\n",
            "|    total_timesteps | 40500      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 20.6       |\n",
            "|    critic_loss     | 15.9       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 37125      |\n",
            "|    reward          | -5.7216434 |\n",
            "-----------------------------------\n",
            "hit end!\n",
            "hit end!\n",
            "hit end!\n",
            "hit end!\n",
            "hit end!\n",
            "[*********************100%***********************]  1 of 1 completed\n",
            "Shape of DataFrame:  (22, 8)\n",
            "i:  2\n",
            "{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}\n",
            "Using cpu device\n",
            "Logging to results/a2c\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 55          |\n",
            "|    iterations         | 100         |\n",
            "|    time_elapsed       | 9           |\n",
            "|    total_timesteps    | 500         |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41         |\n",
            "|    explained_variance | 0.0552      |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 99          |\n",
            "|    policy_loss        | -125        |\n",
            "|    reward             | -0.19224237 |\n",
            "|    std                | 0.997       |\n",
            "|    value_loss         | 10.9        |\n",
            "---------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 65       |\n",
            "|    iterations         | 200      |\n",
            "|    time_elapsed       | 15       |\n",
            "|    total_timesteps    | 1000     |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.1    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 199      |\n",
            "|    policy_loss        | -65.7    |\n",
            "|    reward             | 2.47076  |\n",
            "|    std                | 0.998    |\n",
            "|    value_loss         | 3.14     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 300       |\n",
            "|    time_elapsed       | 24        |\n",
            "|    total_timesteps    | 1500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 299       |\n",
            "|    policy_loss        | 221       |\n",
            "|    reward             | -0.668967 |\n",
            "|    std                | 0.999     |\n",
            "|    value_loss         | 38        |\n",
            "-------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 61       |\n",
            "|    iterations         | 400      |\n",
            "|    time_elapsed       | 32       |\n",
            "|    total_timesteps    | 2000     |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41      |\n",
            "|    explained_variance | 5.96e-08 |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 399      |\n",
            "|    policy_loss        | 2.8      |\n",
            "|    reward             | 2.104001 |\n",
            "|    std                | 0.997    |\n",
            "|    value_loss         | 2.71     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 64        |\n",
            "|    iterations         | 500       |\n",
            "|    time_elapsed       | 39        |\n",
            "|    total_timesteps    | 2500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 499       |\n",
            "|    policy_loss        | 239       |\n",
            "|    reward             | 3.0126274 |\n",
            "|    std                | 0.999     |\n",
            "|    value_loss         | 39.6      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 600       |\n",
            "|    time_elapsed       | 48        |\n",
            "|    total_timesteps    | 3000      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 599       |\n",
            "|    policy_loss        | -524      |\n",
            "|    reward             | 3.9946847 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 293       |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 62        |\n",
            "|    iterations         | 700       |\n",
            "|    time_elapsed       | 56        |\n",
            "|    total_timesteps    | 3500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | 0.108     |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 699       |\n",
            "|    policy_loss        | -37.1     |\n",
            "|    reward             | 1.4987615 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 1.8       |\n",
            "-------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 63       |\n",
            "|    iterations         | 800      |\n",
            "|    time_elapsed       | 62       |\n",
            "|    total_timesteps    | 4000     |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.2    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 799      |\n",
            "|    policy_loss        | -337     |\n",
            "|    reward             | 2.046587 |\n",
            "|    std                | 1        |\n",
            "|    value_loss         | 77.1     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 900       |\n",
            "|    time_elapsed       | 72        |\n",
            "|    total_timesteps    | 4500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 899       |\n",
            "|    policy_loss        | 26.2      |\n",
            "|    reward             | 1.0195923 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 4.92      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 62         |\n",
            "|    iterations         | 1000       |\n",
            "|    time_elapsed       | 79         |\n",
            "|    total_timesteps    | 5000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.2      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 999        |\n",
            "|    policy_loss        | 80.3       |\n",
            "|    reward             | -3.5495179 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 9.56       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 63        |\n",
            "|    iterations         | 1100      |\n",
            "|    time_elapsed       | 86        |\n",
            "|    total_timesteps    | 5500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | -2.38e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1099      |\n",
            "|    policy_loss        | -258      |\n",
            "|    reward             | 1.6695346 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 44.9      |\n",
            "-------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 62       |\n",
            "|    iterations         | 1200     |\n",
            "|    time_elapsed       | 96       |\n",
            "|    total_timesteps    | 6000     |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.1    |\n",
            "|    explained_variance | -0.00397 |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 1199     |\n",
            "|    policy_loss        | 185      |\n",
            "|    reward             | 2.245284 |\n",
            "|    std                | 1        |\n",
            "|    value_loss         | 26.3     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 60        |\n",
            "|    iterations         | 1300      |\n",
            "|    time_elapsed       | 106       |\n",
            "|    total_timesteps    | 6500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1299      |\n",
            "|    policy_loss        | 150       |\n",
            "|    reward             | 1.491629  |\n",
            "|    std                | 0.998     |\n",
            "|    value_loss         | 21.5      |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 1400        |\n",
            "|    time_elapsed       | 117         |\n",
            "|    total_timesteps    | 7000        |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.1       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 1399        |\n",
            "|    policy_loss        | -107        |\n",
            "|    reward             | -0.81370664 |\n",
            "|    std                | 0.998       |\n",
            "|    value_loss         | 8.34        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 60        |\n",
            "|    iterations         | 1500      |\n",
            "|    time_elapsed       | 124       |\n",
            "|    total_timesteps    | 7500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | -0.0459   |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1499      |\n",
            "|    policy_loss        | -10       |\n",
            "|    reward             | 2.3688922 |\n",
            "|    std                | 0.997     |\n",
            "|    value_loss         | 0.922     |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 60         |\n",
            "|    iterations         | 1600       |\n",
            "|    time_elapsed       | 131        |\n",
            "|    total_timesteps    | 8000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 1599       |\n",
            "|    policy_loss        | 128        |\n",
            "|    reward             | 0.56861943 |\n",
            "|    std                | 0.998      |\n",
            "|    value_loss         | 19.5       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 1700      |\n",
            "|    time_elapsed       | 141       |\n",
            "|    total_timesteps    | 8500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0.203     |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1699      |\n",
            "|    policy_loss        | 37        |\n",
            "|    reward             | 1.2727017 |\n",
            "|    std                | 0.996     |\n",
            "|    value_loss         | 3.2       |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 60        |\n",
            "|    iterations         | 1800      |\n",
            "|    time_elapsed       | 148       |\n",
            "|    total_timesteps    | 9000      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1799      |\n",
            "|    policy_loss        | 80.9      |\n",
            "|    reward             | 3.9352329 |\n",
            "|    std                | 0.996     |\n",
            "|    value_loss         | 9.56      |\n",
            "-------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 60       |\n",
            "|    iterations         | 1900     |\n",
            "|    time_elapsed       | 156      |\n",
            "|    total_timesteps    | 9500     |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41      |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 1899     |\n",
            "|    policy_loss        | 507      |\n",
            "|    reward             | 6.328624 |\n",
            "|    std                | 0.997    |\n",
            "|    value_loss         | 187      |\n",
            "------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 60         |\n",
            "|    iterations         | 2000       |\n",
            "|    time_elapsed       | 166        |\n",
            "|    total_timesteps    | 10000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 1999       |\n",
            "|    policy_loss        | -126       |\n",
            "|    reward             | -2.8187668 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 8.37       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 60        |\n",
            "|    iterations         | 2100      |\n",
            "|    time_elapsed       | 172       |\n",
            "|    total_timesteps    | 10500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2099      |\n",
            "|    policy_loss        | -58.5     |\n",
            "|    reward             | 0.2734109 |\n",
            "|    std                | 0.999     |\n",
            "|    value_loss         | 3         |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 60         |\n",
            "|    iterations         | 2200       |\n",
            "|    time_elapsed       | 180        |\n",
            "|    total_timesteps    | 11000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2199       |\n",
            "|    policy_loss        | 157        |\n",
            "|    reward             | 0.68144214 |\n",
            "|    std                | 0.997      |\n",
            "|    value_loss         | 19.9       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 60         |\n",
            "|    iterations         | 2300       |\n",
            "|    time_elapsed       | 190        |\n",
            "|    total_timesteps    | 11500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 1.19e-07   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2299       |\n",
            "|    policy_loss        | -67.3      |\n",
            "|    reward             | -1.8721669 |\n",
            "|    std                | 0.999      |\n",
            "|    value_loss         | 2.49       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 2400       |\n",
            "|    time_elapsed       | 196        |\n",
            "|    total_timesteps    | 12000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | -0.0105    |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2399       |\n",
            "|    policy_loss        | 144        |\n",
            "|    reward             | 0.47134838 |\n",
            "|    std                | 0.997      |\n",
            "|    value_loss         | 26.3       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 60         |\n",
            "|    iterations         | 2500       |\n",
            "|    time_elapsed       | 204        |\n",
            "|    total_timesteps    | 12500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2499       |\n",
            "|    policy_loss        | 589        |\n",
            "|    reward             | -1.9081986 |\n",
            "|    std                | 0.997      |\n",
            "|    value_loss         | 221        |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 60        |\n",
            "|    iterations         | 2600      |\n",
            "|    time_elapsed       | 213       |\n",
            "|    total_timesteps    | 13000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2599      |\n",
            "|    policy_loss        | 352       |\n",
            "|    reward             | 6.1386447 |\n",
            "|    std                | 0.996     |\n",
            "|    value_loss         | 148       |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 2700      |\n",
            "|    time_elapsed       | 219       |\n",
            "|    total_timesteps    | 13500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | -0.0134   |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2699      |\n",
            "|    policy_loss        | -131      |\n",
            "|    reward             | 0.6143146 |\n",
            "|    std                | 0.995     |\n",
            "|    value_loss         | 12.3      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 2800      |\n",
            "|    time_elapsed       | 229       |\n",
            "|    total_timesteps    | 14000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2799      |\n",
            "|    policy_loss        | 132       |\n",
            "|    reward             | 1.7656372 |\n",
            "|    std                | 0.995     |\n",
            "|    value_loss         | 13.2      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 2900       |\n",
            "|    time_elapsed       | 237        |\n",
            "|    total_timesteps    | 14500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41        |\n",
            "|    explained_variance | -0.0245    |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2899       |\n",
            "|    policy_loss        | 17.1       |\n",
            "|    reward             | 0.08867768 |\n",
            "|    std                | 0.995      |\n",
            "|    value_loss         | 3.2        |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 3000       |\n",
            "|    time_elapsed       | 243        |\n",
            "|    total_timesteps    | 15000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41        |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2999       |\n",
            "|    policy_loss        | 48.4       |\n",
            "|    reward             | -3.6771903 |\n",
            "|    std                | 0.995      |\n",
            "|    value_loss         | 2.24       |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 61       |\n",
            "|    iterations         | 3100     |\n",
            "|    time_elapsed       | 253      |\n",
            "|    total_timesteps    | 15500    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.1    |\n",
            "|    explained_variance | 1.19e-07 |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 3099     |\n",
            "|    policy_loss        | -0.565   |\n",
            "|    reward             | 0.679106 |\n",
            "|    std                | 0.998    |\n",
            "|    value_loss         | 1.98     |\n",
            "------------------------------------\n",
            "----------------------------------------\n",
            "| time/                 |              |\n",
            "|    fps                | 61           |\n",
            "|    iterations         | 3200         |\n",
            "|    time_elapsed       | 260          |\n",
            "|    total_timesteps    | 16000        |\n",
            "| train/                |              |\n",
            "|    entropy_loss       | -41.1        |\n",
            "|    explained_variance | 0            |\n",
            "|    learning_rate      | 0.0007       |\n",
            "|    n_updates          | 3199         |\n",
            "|    policy_loss        | 26.6         |\n",
            "|    reward             | 0.0013427841 |\n",
            "|    std                | 0.998        |\n",
            "|    value_loss         | 11.3         |\n",
            "----------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 3300      |\n",
            "|    time_elapsed       | 267       |\n",
            "|    total_timesteps    | 16500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 3299      |\n",
            "|    policy_loss        | 54.6      |\n",
            "|    reward             | 1.6012005 |\n",
            "|    std                | 0.998     |\n",
            "|    value_loss         | 7.47      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 3400      |\n",
            "|    time_elapsed       | 277       |\n",
            "|    total_timesteps    | 17000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | -0.0366   |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 3399      |\n",
            "|    policy_loss        | -33.7     |\n",
            "|    reward             | 0.8685799 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 3.79      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 3500       |\n",
            "|    time_elapsed       | 284        |\n",
            "|    total_timesteps    | 17500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 3499       |\n",
            "|    policy_loss        | 21         |\n",
            "|    reward             | 0.14613488 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 1.32       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 3600       |\n",
            "|    time_elapsed       | 291        |\n",
            "|    total_timesteps    | 18000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 5.96e-08   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 3599       |\n",
            "|    policy_loss        | -173       |\n",
            "|    reward             | -1.2669375 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 20.9       |\n",
            "--------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                 |               |\n",
            "|    fps                | 61            |\n",
            "|    iterations         | 3700          |\n",
            "|    time_elapsed       | 301           |\n",
            "|    total_timesteps    | 18500         |\n",
            "| train/                |               |\n",
            "|    entropy_loss       | -41.2         |\n",
            "|    explained_variance | 0             |\n",
            "|    learning_rate      | 0.0007        |\n",
            "|    n_updates          | 3699          |\n",
            "|    policy_loss        | -53.5         |\n",
            "|    reward             | -0.0045535835 |\n",
            "|    std                | 1             |\n",
            "|    value_loss         | 8.58          |\n",
            "-----------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 3800      |\n",
            "|    time_elapsed       | 308       |\n",
            "|    total_timesteps    | 19000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 3799      |\n",
            "|    policy_loss        | 98.2      |\n",
            "|    reward             | 0.6932831 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 7.95      |\n",
            "-------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 61       |\n",
            "|    iterations         | 3900     |\n",
            "|    time_elapsed       | 316      |\n",
            "|    total_timesteps    | 19500    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.1    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 3899     |\n",
            "|    policy_loss        | 628      |\n",
            "|    reward             | 8.699067 |\n",
            "|    std                | 1        |\n",
            "|    value_loss         | 335      |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 4000      |\n",
            "|    time_elapsed       | 325       |\n",
            "|    total_timesteps    | 20000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 3999      |\n",
            "|    policy_loss        | 605       |\n",
            "|    reward             | 17.186195 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 266       |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 4100      |\n",
            "|    time_elapsed       | 331       |\n",
            "|    total_timesteps    | 20500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0.0757    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4099      |\n",
            "|    policy_loss        | 187       |\n",
            "|    reward             | 1.1687305 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 22.2      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 4200       |\n",
            "|    time_elapsed       | 341        |\n",
            "|    total_timesteps    | 21000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 4199       |\n",
            "|    policy_loss        | -7.42      |\n",
            "|    reward             | -0.5722446 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 0.248      |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 4300        |\n",
            "|    time_elapsed       | 351         |\n",
            "|    total_timesteps    | 21500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.2       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 4299        |\n",
            "|    policy_loss        | 44.7        |\n",
            "|    reward             | -0.39718863 |\n",
            "|    std                | 1           |\n",
            "|    value_loss         | 2.09        |\n",
            "---------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 4400        |\n",
            "|    time_elapsed       | 359         |\n",
            "|    total_timesteps    | 22000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.2       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 4399        |\n",
            "|    policy_loss        | -29.9       |\n",
            "|    reward             | -0.07916131 |\n",
            "|    std                | 1           |\n",
            "|    value_loss         | 0.947       |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 60        |\n",
            "|    iterations         | 4500      |\n",
            "|    time_elapsed       | 369       |\n",
            "|    total_timesteps    | 22500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4499      |\n",
            "|    policy_loss        | -28.5     |\n",
            "|    reward             | 1.8631558 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 1.83      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 4600      |\n",
            "|    time_elapsed       | 375       |\n",
            "|    total_timesteps    | 23000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4599      |\n",
            "|    policy_loss        | 58.4      |\n",
            "|    reward             | 1.4081724 |\n",
            "|    std                | 0.999     |\n",
            "|    value_loss         | 77        |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 4700       |\n",
            "|    time_elapsed       | 383        |\n",
            "|    total_timesteps    | 23500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 4699       |\n",
            "|    policy_loss        | 79.9       |\n",
            "|    reward             | -2.0728068 |\n",
            "|    std                | 0.998      |\n",
            "|    value_loss         | 4.66       |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 4800        |\n",
            "|    time_elapsed       | 392         |\n",
            "|    total_timesteps    | 24000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.1       |\n",
            "|    explained_variance | 0.0901      |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 4799        |\n",
            "|    policy_loss        | -15.9       |\n",
            "|    reward             | -0.09403604 |\n",
            "|    std                | 0.998       |\n",
            "|    value_loss         | 0.165       |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 4900      |\n",
            "|    time_elapsed       | 399       |\n",
            "|    total_timesteps    | 24500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4899      |\n",
            "|    policy_loss        | -110      |\n",
            "|    reward             | 2.0542228 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 10.8      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 5000      |\n",
            "|    time_elapsed       | 407       |\n",
            "|    total_timesteps    | 25000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4999      |\n",
            "|    policy_loss        | -186      |\n",
            "|    reward             | 2.1355224 |\n",
            "|    std                | 0.999     |\n",
            "|    value_loss         | 27.6      |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 5100        |\n",
            "|    time_elapsed       | 416         |\n",
            "|    total_timesteps    | 25500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.1       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 5099        |\n",
            "|    policy_loss        | -13.1       |\n",
            "|    reward             | -0.08651471 |\n",
            "|    std                | 0.999       |\n",
            "|    value_loss         | 1.59        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 5200      |\n",
            "|    time_elapsed       | 423       |\n",
            "|    total_timesteps    | 26000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5199      |\n",
            "|    policy_loss        | 99        |\n",
            "|    reward             | 2.3819537 |\n",
            "|    std                | 0.997     |\n",
            "|    value_loss         | 29.5      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 5300       |\n",
            "|    time_elapsed       | 432        |\n",
            "|    total_timesteps    | 26500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41        |\n",
            "|    explained_variance | 1.19e-07   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5299       |\n",
            "|    policy_loss        | 372        |\n",
            "|    reward             | -22.664398 |\n",
            "|    std                | 0.997      |\n",
            "|    value_loss         | 196        |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 5400      |\n",
            "|    time_elapsed       | 440       |\n",
            "|    total_timesteps    | 27000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5399      |\n",
            "|    policy_loss        | 153       |\n",
            "|    reward             | 1.1176498 |\n",
            "|    std                | 0.995     |\n",
            "|    value_loss         | 15.5      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 5500       |\n",
            "|    time_elapsed       | 446        |\n",
            "|    total_timesteps    | 27500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41        |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5499       |\n",
            "|    policy_loss        | -92.6      |\n",
            "|    reward             | -5.1304746 |\n",
            "|    std                | 0.996      |\n",
            "|    value_loss         | 14.5       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 5600      |\n",
            "|    time_elapsed       | 456       |\n",
            "|    total_timesteps    | 28000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | 0.0186    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5599      |\n",
            "|    policy_loss        | 62.9      |\n",
            "|    reward             | 1.1683302 |\n",
            "|    std                | 0.996     |\n",
            "|    value_loss         | 9.46      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 5700       |\n",
            "|    time_elapsed       | 464        |\n",
            "|    total_timesteps    | 28500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41        |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5699       |\n",
            "|    policy_loss        | 34.4       |\n",
            "|    reward             | -3.4618378 |\n",
            "|    std                | 0.995      |\n",
            "|    value_loss         | 12         |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 5800      |\n",
            "|    time_elapsed       | 471       |\n",
            "|    total_timesteps    | 29000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41       |\n",
            "|    explained_variance | -0.247    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5799      |\n",
            "|    policy_loss        | -171      |\n",
            "|    reward             | 5.8363895 |\n",
            "|    std                | 0.996     |\n",
            "|    value_loss         | 20.7      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 5900       |\n",
            "|    time_elapsed       | 481        |\n",
            "|    total_timesteps    | 29500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41        |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5899       |\n",
            "|    policy_loss        | 250        |\n",
            "|    reward             | -4.7651134 |\n",
            "|    std                | 0.996      |\n",
            "|    value_loss         | 83.8       |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 61       |\n",
            "|    iterations         | 6000     |\n",
            "|    time_elapsed       | 487      |\n",
            "|    total_timesteps    | 30000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.1    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 5999     |\n",
            "|    policy_loss        | -211     |\n",
            "|    reward             | 3.167639 |\n",
            "|    std                | 0.999    |\n",
            "|    value_loss         | 95.9     |\n",
            "------------------------------------\n",
            "day: 3352, episode: 10\n",
            "begin_total_asset: 1005653.74\n",
            "end_total_asset: 8785262.38\n",
            "total_reward: 7779608.65\n",
            "total_cost: 31168.83\n",
            "total_trades: 45054\n",
            "Sharpe: 0.834\n",
            "=================================\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 6100       |\n",
            "|    time_elapsed       | 495        |\n",
            "|    total_timesteps    | 30500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.1      |\n",
            "|    explained_variance | -0.113     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6099       |\n",
            "|    policy_loss        | 21.6       |\n",
            "|    reward             | 0.12259659 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 0.918      |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 6200      |\n",
            "|    time_elapsed       | 505       |\n",
            "|    total_timesteps    | 31000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.1     |\n",
            "|    explained_variance | -0.00301  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6199      |\n",
            "|    policy_loss        | -126      |\n",
            "|    reward             | 1.5182142 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 16.8      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 6300      |\n",
            "|    time_elapsed       | 511       |\n",
            "|    total_timesteps    | 31500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | -1.74     |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6299      |\n",
            "|    policy_loss        | -25.1     |\n",
            "|    reward             | 0.5284405 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 1.56      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 6400       |\n",
            "|    time_elapsed       | 519        |\n",
            "|    total_timesteps    | 32000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.3      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6399       |\n",
            "|    policy_loss        | 123        |\n",
            "|    reward             | -3.4704874 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 14.2       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 6500      |\n",
            "|    time_elapsed       | 528       |\n",
            "|    total_timesteps    | 32500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6499      |\n",
            "|    policy_loss        | -102      |\n",
            "|    reward             | 3.8022645 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 11        |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 6600      |\n",
            "|    time_elapsed       | 535       |\n",
            "|    total_timesteps    | 33000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6599      |\n",
            "|    policy_loss        | 372       |\n",
            "|    reward             | 24.101572 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 136       |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 6700       |\n",
            "|    time_elapsed       | 543        |\n",
            "|    total_timesteps    | 33500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.4      |\n",
            "|    explained_variance | 5.96e-08   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6699       |\n",
            "|    policy_loss        | -923       |\n",
            "|    reward             | -11.455252 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 474        |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 6800       |\n",
            "|    time_elapsed       | 552        |\n",
            "|    total_timesteps    | 34000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.4      |\n",
            "|    explained_variance | -0.553     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6799       |\n",
            "|    policy_loss        | -111       |\n",
            "|    reward             | 0.10300914 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 9.3        |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 6900        |\n",
            "|    time_elapsed       | 558         |\n",
            "|    total_timesteps    | 34500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.4       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 6899        |\n",
            "|    policy_loss        | 133         |\n",
            "|    reward             | -0.71322364 |\n",
            "|    std                | 1.01        |\n",
            "|    value_loss         | 15.3        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 7000      |\n",
            "|    time_elapsed       | 567       |\n",
            "|    total_timesteps    | 35000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6999      |\n",
            "|    policy_loss        | -84.3     |\n",
            "|    reward             | 2.3256698 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 4.58      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 7100      |\n",
            "|    time_elapsed       | 575       |\n",
            "|    total_timesteps    | 35500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 7099      |\n",
            "|    policy_loss        | -2.44     |\n",
            "|    reward             | 0.9263134 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 1.21      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 7200       |\n",
            "|    time_elapsed       | 586        |\n",
            "|    total_timesteps    | 36000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.5      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7199       |\n",
            "|    policy_loss        | -228       |\n",
            "|    reward             | -1.9283981 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 42         |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 7300      |\n",
            "|    time_elapsed       | 596       |\n",
            "|    total_timesteps    | 36500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.5     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 7299      |\n",
            "|    policy_loss        | 81        |\n",
            "|    reward             | -6.168546 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 8.16      |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 7400        |\n",
            "|    time_elapsed       | 602         |\n",
            "|    total_timesteps    | 37000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.5       |\n",
            "|    explained_variance | -1.19e-07   |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 7399        |\n",
            "|    policy_loss        | -136        |\n",
            "|    reward             | -0.66517484 |\n",
            "|    std                | 1.02        |\n",
            "|    value_loss         | 12.4        |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 7500       |\n",
            "|    time_elapsed       | 610        |\n",
            "|    total_timesteps    | 37500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.6      |\n",
            "|    explained_variance | 1.44e-05   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7499       |\n",
            "|    policy_loss        | -368       |\n",
            "|    reward             | 0.28679553 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 81.7       |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 7600        |\n",
            "|    time_elapsed       | 620         |\n",
            "|    total_timesteps    | 38000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.6       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 7599        |\n",
            "|    policy_loss        | -80.4       |\n",
            "|    reward             | -0.02342434 |\n",
            "|    std                | 1.02        |\n",
            "|    value_loss         | 4.71        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 7700      |\n",
            "|    time_elapsed       | 626       |\n",
            "|    total_timesteps    | 38500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | -3.11     |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 7699      |\n",
            "|    policy_loss        | -91.7     |\n",
            "|    reward             | 2.6142132 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 6.61      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 7800       |\n",
            "|    time_elapsed       | 635        |\n",
            "|    total_timesteps    | 39000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.6      |\n",
            "|    explained_variance | 0.125      |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7799       |\n",
            "|    policy_loss        | 4.76       |\n",
            "|    reward             | -1.2840562 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 0.762      |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 7900        |\n",
            "|    time_elapsed       | 644         |\n",
            "|    total_timesteps    | 39500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.7       |\n",
            "|    explained_variance | 0.0476      |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 7899        |\n",
            "|    policy_loss        | 273         |\n",
            "|    reward             | -0.55217224 |\n",
            "|    std                | 1.02        |\n",
            "|    value_loss         | 46.2        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8000      |\n",
            "|    time_elapsed       | 650       |\n",
            "|    total_timesteps    | 40000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 7999      |\n",
            "|    policy_loss        | -769      |\n",
            "|    reward             | 1.6622137 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 367       |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 8100       |\n",
            "|    time_elapsed       | 660        |\n",
            "|    total_timesteps    | 40500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.6      |\n",
            "|    explained_variance | -0.131     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8099       |\n",
            "|    policy_loss        | 38.2       |\n",
            "|    reward             | 0.38162667 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 1.09       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8200      |\n",
            "|    time_elapsed       | 668       |\n",
            "|    total_timesteps    | 41000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | -0.306    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8199      |\n",
            "|    policy_loss        | 40.8      |\n",
            "|    reward             | 0.8386523 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 3.26      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8300      |\n",
            "|    time_elapsed       | 674       |\n",
            "|    total_timesteps    | 41500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.5     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8299      |\n",
            "|    policy_loss        | 41.7      |\n",
            "|    reward             | 1.5822707 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 4.75      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8400      |\n",
            "|    time_elapsed       | 684       |\n",
            "|    total_timesteps    | 42000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8399      |\n",
            "|    policy_loss        | 133       |\n",
            "|    reward             | 0.1792632 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 15.6      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8500      |\n",
            "|    time_elapsed       | 691       |\n",
            "|    total_timesteps    | 42500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.7     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8499      |\n",
            "|    policy_loss        | 99.2      |\n",
            "|    reward             | 1.6896911 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 33.2      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8600      |\n",
            "|    time_elapsed       | 699       |\n",
            "|    total_timesteps    | 43000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.7     |\n",
            "|    explained_variance | -0.0163   |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8599      |\n",
            "|    policy_loss        | -836      |\n",
            "|    reward             | 30.580954 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 436       |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8700      |\n",
            "|    time_elapsed       | 709       |\n",
            "|    total_timesteps    | 43500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.7     |\n",
            "|    explained_variance | 0.0669    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8699      |\n",
            "|    policy_loss        | -430      |\n",
            "|    reward             | -9.169519 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 186       |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8800      |\n",
            "|    time_elapsed       | 715       |\n",
            "|    total_timesteps    | 44000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.7     |\n",
            "|    explained_variance | 0.178     |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8799      |\n",
            "|    policy_loss        | -2.39     |\n",
            "|    reward             | -0.505542 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 0.0762    |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 8900      |\n",
            "|    time_elapsed       | 723       |\n",
            "|    total_timesteps    | 44500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.7     |\n",
            "|    explained_variance | 0.0519    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8899      |\n",
            "|    policy_loss        | -29       |\n",
            "|    reward             | 1.4009765 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 0.617     |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 9000       |\n",
            "|    time_elapsed       | 733        |\n",
            "|    total_timesteps    | 45000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.8      |\n",
            "|    explained_variance | 0.0464     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8999       |\n",
            "|    policy_loss        | -142       |\n",
            "|    reward             | -1.6482956 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 12.1       |\n",
            "--------------------------------------\n",
            "----------------------------------------\n",
            "| time/                 |              |\n",
            "|    fps                | 61           |\n",
            "|    iterations         | 9100         |\n",
            "|    time_elapsed       | 739          |\n",
            "|    total_timesteps    | 45500        |\n",
            "| train/                |              |\n",
            "|    entropy_loss       | -41.7        |\n",
            "|    explained_variance | 0.128        |\n",
            "|    learning_rate      | 0.0007       |\n",
            "|    n_updates          | 9099         |\n",
            "|    policy_loss        | -18.8        |\n",
            "|    reward             | -0.022230674 |\n",
            "|    std                | 1.02         |\n",
            "|    value_loss         | 1.08         |\n",
            "----------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 61          |\n",
            "|    iterations         | 9200        |\n",
            "|    time_elapsed       | 748         |\n",
            "|    total_timesteps    | 46000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.8       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 9199        |\n",
            "|    policy_loss        | 35.8        |\n",
            "|    reward             | -0.16132466 |\n",
            "|    std                | 1.02        |\n",
            "|    value_loss         | 6.5         |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 9300       |\n",
            "|    time_elapsed       | 757        |\n",
            "|    total_timesteps    | 46500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.7      |\n",
            "|    explained_variance | 0.00416    |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9299       |\n",
            "|    policy_loss        | -192       |\n",
            "|    reward             | -3.3674068 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 87.7       |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 61       |\n",
            "|    iterations         | 9400     |\n",
            "|    time_elapsed       | 763      |\n",
            "|    total_timesteps    | 47000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.8    |\n",
            "|    explained_variance | -0.0393  |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 9399     |\n",
            "|    policy_loss        | -37.6    |\n",
            "|    reward             | 1.150722 |\n",
            "|    std                | 1.02     |\n",
            "|    value_loss         | 3.37     |\n",
            "------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 9500       |\n",
            "|    time_elapsed       | 772        |\n",
            "|    total_timesteps    | 47500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.8      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9499       |\n",
            "|    policy_loss        | -25.2      |\n",
            "|    reward             | 0.41208658 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 0.698      |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 9600      |\n",
            "|    time_elapsed       | 780       |\n",
            "|    total_timesteps    | 48000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.9     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9599      |\n",
            "|    policy_loss        | 9.9       |\n",
            "|    reward             | 0.5765088 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 2.06      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 9700       |\n",
            "|    time_elapsed       | 787        |\n",
            "|    total_timesteps    | 48500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.9      |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9699       |\n",
            "|    policy_loss        | 315        |\n",
            "|    reward             | -0.2841707 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 49.4       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 9800      |\n",
            "|    time_elapsed       | 797       |\n",
            "|    total_timesteps    | 49000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.9     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9799      |\n",
            "|    policy_loss        | 125       |\n",
            "|    reward             | 0.6355639 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 9.9       |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 61         |\n",
            "|    iterations         | 9900       |\n",
            "|    time_elapsed       | 804        |\n",
            "|    total_timesteps    | 49500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.9      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9899       |\n",
            "|    policy_loss        | 155        |\n",
            "|    reward             | -4.6037025 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 16.2       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 61        |\n",
            "|    iterations         | 10000     |\n",
            "|    time_elapsed       | 811       |\n",
            "|    total_timesteps    | 50000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.9     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9999      |\n",
            "|    policy_loss        | 104       |\n",
            "|    reward             | -3.306132 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 9.5       |\n",
            "-------------------------------------\n",
            "{'batch_size': 128, 'buffer_size': 50000, 'learning_rate': 0.001}\n",
            "Using cpu device\n",
            "Logging to results/ddpg\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 24         |\n",
            "|    time_elapsed    | 547        |\n",
            "|    total_timesteps | 13412      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 12.5       |\n",
            "|    critic_loss     | 295        |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 10059      |\n",
            "|    reward          | -6.3763723 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 21         |\n",
            "|    time_elapsed    | 1237       |\n",
            "|    total_timesteps | 26824      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | -5.51      |\n",
            "|    critic_loss     | 14.7       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 23471      |\n",
            "|    reward          | -6.3763723 |\n",
            "-----------------------------------\n",
            "day: 3352, episode: 10\n",
            "begin_total_asset: 1011382.29\n",
            "end_total_asset: 6058882.63\n",
            "total_reward: 5047500.34\n",
            "total_cost: 1010.37\n",
            "total_trades: 46928\n",
            "Sharpe: 0.807\n",
            "=================================\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 12         |\n",
            "|    fps             | 20         |\n",
            "|    time_elapsed    | 1942       |\n",
            "|    total_timesteps | 40236      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | -8.67      |\n",
            "|    critic_loss     | 7.57       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 36883      |\n",
            "|    reward          | -6.3763723 |\n",
            "-----------------------------------\n",
            "{'n_steps': 2048, 'ent_coef': 0.01, 'learning_rate': 0.00025, 'batch_size': 128}\n",
            "Using cpu device\n",
            "Logging to results/ppo\n",
            "------------------------------------\n",
            "| time/              |             |\n",
            "|    fps             | 60          |\n",
            "|    iterations      | 1           |\n",
            "|    time_elapsed    | 33          |\n",
            "|    total_timesteps | 2048        |\n",
            "| train/             |             |\n",
            "|    reward          | -0.20400214 |\n",
            "------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 61          |\n",
            "|    iterations           | 2           |\n",
            "|    time_elapsed         | 66          |\n",
            "|    total_timesteps      | 4096        |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.019446293 |\n",
            "|    clip_fraction        | 0.218       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.2       |\n",
            "|    explained_variance   | -0.0153     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 8.32        |\n",
            "|    n_updates            | 10          |\n",
            "|    policy_gradient_loss | -0.0239     |\n",
            "|    reward               | 0.964798    |\n",
            "|    std                  | 1           |\n",
            "|    value_loss           | 12.1        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 60          |\n",
            "|    iterations           | 3           |\n",
            "|    time_elapsed         | 100         |\n",
            "|    total_timesteps      | 6144        |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017402954 |\n",
            "|    clip_fraction        | 0.182       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.2       |\n",
            "|    explained_variance   | 0.000782    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 28.3        |\n",
            "|    n_updates            | 20          |\n",
            "|    policy_gradient_loss | -0.0164     |\n",
            "|    reward               | 7.5384216   |\n",
            "|    std                  | 1           |\n",
            "|    value_loss           | 51.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 4           |\n",
            "|    time_elapsed         | 131         |\n",
            "|    total_timesteps      | 8192        |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.015737543 |\n",
            "|    clip_fraction        | 0.162       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.3       |\n",
            "|    explained_variance   | -0.0037     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 22.7        |\n",
            "|    n_updates            | 30          |\n",
            "|    policy_gradient_loss | -0.0225     |\n",
            "|    reward               | 2.27421     |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 38.2        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 5           |\n",
            "|    time_elapsed         | 165         |\n",
            "|    total_timesteps      | 10240       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.020310912 |\n",
            "|    clip_fraction        | 0.184       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.3       |\n",
            "|    explained_variance   | -0.00721    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 12          |\n",
            "|    n_updates            | 40          |\n",
            "|    policy_gradient_loss | -0.0202     |\n",
            "|    reward               | 0.7753585   |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 21.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 6           |\n",
            "|    time_elapsed         | 196         |\n",
            "|    total_timesteps      | 12288       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.014960194 |\n",
            "|    clip_fraction        | 0.143       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.4       |\n",
            "|    explained_variance   | -0.0299     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 30.5        |\n",
            "|    n_updates            | 50          |\n",
            "|    policy_gradient_loss | -0.0179     |\n",
            "|    reward               | 2.62347     |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 46.2        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 7           |\n",
            "|    time_elapsed         | 228         |\n",
            "|    total_timesteps      | 14336       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.023127541 |\n",
            "|    clip_fraction        | 0.193       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.4       |\n",
            "|    explained_variance   | -0.023      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 6.36        |\n",
            "|    n_updates            | 60          |\n",
            "|    policy_gradient_loss | -0.0221     |\n",
            "|    reward               | 1.0379714   |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 14.1        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 8           |\n",
            "|    time_elapsed         | 260         |\n",
            "|    total_timesteps      | 16384       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.018745095 |\n",
            "|    clip_fraction        | 0.201       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.5       |\n",
            "|    explained_variance   | 0.00254     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 18.3        |\n",
            "|    n_updates            | 70          |\n",
            "|    policy_gradient_loss | -0.019      |\n",
            "|    reward               | -0.21705139 |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 59.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 9           |\n",
            "|    time_elapsed         | 294         |\n",
            "|    total_timesteps      | 18432       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.018167643 |\n",
            "|    clip_fraction        | 0.154       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.5       |\n",
            "|    explained_variance   | -0.000664   |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 21.5        |\n",
            "|    n_updates            | 80          |\n",
            "|    policy_gradient_loss | -0.0144     |\n",
            "|    reward               | -0.31962025 |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 36.6        |\n",
            "-----------------------------------------\n",
            "----------------------------------------\n",
            "| time/                   |            |\n",
            "|    fps                  | 62         |\n",
            "|    iterations           | 10         |\n",
            "|    time_elapsed         | 328        |\n",
            "|    total_timesteps      | 20480      |\n",
            "| train/                  |            |\n",
            "|    approx_kl            | 0.02108417 |\n",
            "|    clip_fraction        | 0.244      |\n",
            "|    clip_range           | 0.2        |\n",
            "|    entropy_loss         | -41.6      |\n",
            "|    explained_variance   | 0.0203     |\n",
            "|    learning_rate        | 0.00025    |\n",
            "|    loss                 | 7.36       |\n",
            "|    n_updates            | 90         |\n",
            "|    policy_gradient_loss | -0.0191    |\n",
            "|    reward               | 0.07936729 |\n",
            "|    std                  | 1.02       |\n",
            "|    value_loss           | 23.3       |\n",
            "----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 11          |\n",
            "|    time_elapsed         | 357         |\n",
            "|    total_timesteps      | 22528       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.014700897 |\n",
            "|    clip_fraction        | 0.166       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.6       |\n",
            "|    explained_variance   | 0.00383     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 29.1        |\n",
            "|    n_updates            | 100         |\n",
            "|    policy_gradient_loss | -0.0156     |\n",
            "|    reward               | 1.4870173   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 93.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 12          |\n",
            "|    time_elapsed         | 391         |\n",
            "|    total_timesteps      | 24576       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017688308 |\n",
            "|    clip_fraction        | 0.194       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | -0.0104     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 6.58        |\n",
            "|    n_updates            | 110         |\n",
            "|    policy_gradient_loss | -0.0161     |\n",
            "|    reward               | -0.7623598  |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 17.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 13          |\n",
            "|    time_elapsed         | 422         |\n",
            "|    total_timesteps      | 26624       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.023069832 |\n",
            "|    clip_fraction        | 0.24        |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | 0.0101      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 38.8        |\n",
            "|    n_updates            | 120         |\n",
            "|    policy_gradient_loss | -0.0147     |\n",
            "|    reward               | 3.4454083   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 64.9        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 14          |\n",
            "|    time_elapsed         | 454         |\n",
            "|    total_timesteps      | 28672       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017561657 |\n",
            "|    clip_fraction        | 0.204       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | -0.0172     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 21.4        |\n",
            "|    n_updates            | 130         |\n",
            "|    policy_gradient_loss | -0.02       |\n",
            "|    reward               | 0.9586051   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 52.4        |\n",
            "-----------------------------------------\n",
            "day: 3352, episode: 10\n",
            "begin_total_asset: 988584.72\n",
            "end_total_asset: 3416710.65\n",
            "total_reward: 2428125.94\n",
            "total_cost: 420148.99\n",
            "total_trades: 89136\n",
            "Sharpe: 0.598\n",
            "=================================\n",
            "----------------------------------------\n",
            "| time/                   |            |\n",
            "|    fps                  | 63         |\n",
            "|    iterations           | 15         |\n",
            "|    time_elapsed         | 487        |\n",
            "|    total_timesteps      | 30720      |\n",
            "| train/                  |            |\n",
            "|    approx_kl            | 0.02006042 |\n",
            "|    clip_fraction        | 0.219      |\n",
            "|    clip_range           | 0.2        |\n",
            "|    entropy_loss         | -41.7      |\n",
            "|    explained_variance   | -0.0279    |\n",
            "|    learning_rate        | 0.00025    |\n",
            "|    loss                 | 13.3       |\n",
            "|    n_updates            | 140        |\n",
            "|    policy_gradient_loss | -0.0185    |\n",
            "|    reward               | -0.3580386 |\n",
            "|    std                  | 1.02       |\n",
            "|    value_loss           | 23.2       |\n",
            "----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 16          |\n",
            "|    time_elapsed         | 523         |\n",
            "|    total_timesteps      | 32768       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.025233287 |\n",
            "|    clip_fraction        | 0.243       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.8       |\n",
            "|    explained_variance   | -0.00552    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 22.4        |\n",
            "|    n_updates            | 150         |\n",
            "|    policy_gradient_loss | -0.0176     |\n",
            "|    reward               | -0.5090524  |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 69.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 17          |\n",
            "|    time_elapsed         | 556         |\n",
            "|    total_timesteps      | 34816       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.022021335 |\n",
            "|    clip_fraction        | 0.216       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.8       |\n",
            "|    explained_variance   | 0.0188      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 8.75        |\n",
            "|    n_updates            | 160         |\n",
            "|    policy_gradient_loss | -0.0188     |\n",
            "|    reward               | 1.8985721   |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 23.2        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 18          |\n",
            "|    time_elapsed         | 586         |\n",
            "|    total_timesteps      | 36864       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.019396901 |\n",
            "|    clip_fraction        | 0.229       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.9       |\n",
            "|    explained_variance   | 0.00194     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 14.6        |\n",
            "|    n_updates            | 170         |\n",
            "|    policy_gradient_loss | -0.0195     |\n",
            "|    reward               | -0.31956208 |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 39.6        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 19          |\n",
            "|    time_elapsed         | 622         |\n",
            "|    total_timesteps      | 38912       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.020318478 |\n",
            "|    clip_fraction        | 0.225       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.9       |\n",
            "|    explained_variance   | 0.0132      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 22.3        |\n",
            "|    n_updates            | 180         |\n",
            "|    policy_gradient_loss | -0.0128     |\n",
            "|    reward               | 0.33881456  |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 55.7        |\n",
            "-----------------------------------------\n",
            "----------------------------------------\n",
            "| time/                   |            |\n",
            "|    fps                  | 62         |\n",
            "|    iterations           | 20         |\n",
            "|    time_elapsed         | 652        |\n",
            "|    total_timesteps      | 40960      |\n",
            "| train/                  |            |\n",
            "|    approx_kl            | 0.02080874 |\n",
            "|    clip_fraction        | 0.179      |\n",
            "|    clip_range           | 0.2        |\n",
            "|    entropy_loss         | -42        |\n",
            "|    explained_variance   | 0.0334     |\n",
            "|    learning_rate        | 0.00025    |\n",
            "|    loss                 | 5.55       |\n",
            "|    n_updates            | 190        |\n",
            "|    policy_gradient_loss | -0.0186    |\n",
            "|    reward               | 0.15585361 |\n",
            "|    std                  | 1.03       |\n",
            "|    value_loss           | 19.1       |\n",
            "----------------------------------------\n",
            "----------------------------------------\n",
            "| time/                   |            |\n",
            "|    fps                  | 62         |\n",
            "|    iterations           | 21         |\n",
            "|    time_elapsed         | 686        |\n",
            "|    total_timesteps      | 43008      |\n",
            "| train/                  |            |\n",
            "|    approx_kl            | 0.01973752 |\n",
            "|    clip_fraction        | 0.227      |\n",
            "|    clip_range           | 0.2        |\n",
            "|    entropy_loss         | -42        |\n",
            "|    explained_variance   | 0.00997    |\n",
            "|    learning_rate        | 0.00025    |\n",
            "|    loss                 | 19         |\n",
            "|    n_updates            | 200        |\n",
            "|    policy_gradient_loss | -0.0153    |\n",
            "|    reward               | -14.07267  |\n",
            "|    std                  | 1.03       |\n",
            "|    value_loss           | 75.2       |\n",
            "----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 22          |\n",
            "|    time_elapsed         | 717         |\n",
            "|    total_timesteps      | 45056       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.013898542 |\n",
            "|    clip_fraction        | 0.0931      |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | -0.000876   |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 12.2        |\n",
            "|    n_updates            | 210         |\n",
            "|    policy_gradient_loss | -0.0138     |\n",
            "|    reward               | -5.085373   |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 27          |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 23          |\n",
            "|    time_elapsed         | 750         |\n",
            "|    total_timesteps      | 47104       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.01667095  |\n",
            "|    clip_fraction        | 0.185       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.1       |\n",
            "|    explained_variance   | 0.00379     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 8.83        |\n",
            "|    n_updates            | 220         |\n",
            "|    policy_gradient_loss | -0.0139     |\n",
            "|    reward               | -0.11939671 |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 30.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 24          |\n",
            "|    time_elapsed         | 785         |\n",
            "|    total_timesteps      | 49152       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.027711859 |\n",
            "|    clip_fraction        | 0.253       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.1       |\n",
            "|    explained_variance   | 0.0238      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 31.9        |\n",
            "|    n_updates            | 230         |\n",
            "|    policy_gradient_loss | -0.00308    |\n",
            "|    reward               | -1.080327   |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 75          |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 62          |\n",
            "|    iterations           | 25          |\n",
            "|    time_elapsed         | 817         |\n",
            "|    total_timesteps      | 51200       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.025901645 |\n",
            "|    clip_fraction        | 0.278       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.1       |\n",
            "|    explained_variance   | 0.0481      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 5.34        |\n",
            "|    n_updates            | 240         |\n",
            "|    policy_gradient_loss | -0.0164     |\n",
            "|    reward               | 0.08477563  |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 13          |\n",
            "-----------------------------------------\n",
            "{'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}\n",
            "Using cpu device\n",
            "Logging to results/sac\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 19         |\n",
            "|    time_elapsed    | 703        |\n",
            "|    total_timesteps | 13412      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 1.68e+03   |\n",
            "|    critic_loss     | 1e+04      |\n",
            "|    ent_coef        | 0.309      |\n",
            "|    ent_coef_loss   | -0.516     |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 13311      |\n",
            "|    reward          | -11.183781 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 19         |\n",
            "|    time_elapsed    | 1410       |\n",
            "|    total_timesteps | 26824      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 677        |\n",
            "|    critic_loss     | 66.2       |\n",
            "|    ent_coef        | 0.0855     |\n",
            "|    ent_coef_loss   | -112       |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 26723      |\n",
            "|    reward          | -10.753805 |\n",
            "-----------------------------------\n",
            "day: 3352, episode: 10\n",
            "begin_total_asset: 1005927.23\n",
            "end_total_asset: 5294689.46\n",
            "total_reward: 4288762.22\n",
            "total_cost: 37988.65\n",
            "total_trades: 61507\n",
            "Sharpe: 0.700\n",
            "=================================\n",
            "----------------------------------\n",
            "| time/              |           |\n",
            "|    episodes        | 12        |\n",
            "|    fps             | 18        |\n",
            "|    time_elapsed    | 2126      |\n",
            "|    total_timesteps | 40236     |\n",
            "| train/             |           |\n",
            "|    actor_loss      | 304       |\n",
            "|    critic_loss     | 20.1      |\n",
            "|    ent_coef        | 0.0227    |\n",
            "|    ent_coef_loss   | -145      |\n",
            "|    learning_rate   | 0.0001    |\n",
            "|    n_updates       | 40135     |\n",
            "|    reward          | -9.593834 |\n",
            "----------------------------------\n",
            "{'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001}\n",
            "Using cpu device\n",
            "Logging to results/td3\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 24         |\n",
            "|    time_elapsed    | 544        |\n",
            "|    total_timesteps | 13412      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 132        |\n",
            "|    critic_loss     | 6.19e+03   |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 10059      |\n",
            "|    reward          | -2.3487854 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 21         |\n",
            "|    time_elapsed    | 1242       |\n",
            "|    total_timesteps | 26824      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 49         |\n",
            "|    critic_loss     | 584        |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 23471      |\n",
            "|    reward          | -2.3487854 |\n",
            "-----------------------------------\n",
            "day: 3352, episode: 10\n",
            "begin_total_asset: 1012427.98\n",
            "end_total_asset: 5866237.13\n",
            "total_reward: 4853809.15\n",
            "total_cost: 1011.41\n",
            "total_trades: 53632\n",
            "Sharpe: 0.831\n",
            "=================================\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 12         |\n",
            "|    fps             | 20         |\n",
            "|    time_elapsed    | 1943       |\n",
            "|    total_timesteps | 40236      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 37.3       |\n",
            "|    critic_loss     | 101        |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 36883      |\n",
            "|    reward          | -2.3487854 |\n",
            "-----------------------------------\n",
            "hit end!\n",
            "hit end!\n",
            "hit end!\n",
            "hit end!\n",
            "hit end!\n",
            "[*********************100%***********************]  1 of 1 completed\n",
            "Shape of DataFrame:  (22, 8)\n",
            "i:  3\n",
            "{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}\n",
            "Using cpu device\n",
            "Logging to results/a2c\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 46        |\n",
            "|    iterations         | 100       |\n",
            "|    time_elapsed       | 10        |\n",
            "|    total_timesteps    | 500       |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.2     |\n",
            "|    explained_variance | -0.471    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 99        |\n",
            "|    policy_loss        | 86.2      |\n",
            "|    reward             | 1.3343517 |\n",
            "|    std                | 1         |\n",
            "|    value_loss         | 5.99      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 47         |\n",
            "|    iterations         | 200        |\n",
            "|    time_elapsed       | 20         |\n",
            "|    total_timesteps    | 1000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.3      |\n",
            "|    explained_variance | -0.271     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 199        |\n",
            "|    policy_loss        | 44.4       |\n",
            "|    reward             | -1.4969016 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 1.22       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 47         |\n",
            "|    iterations         | 300        |\n",
            "|    time_elapsed       | 31         |\n",
            "|    total_timesteps    | 1500       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.3      |\n",
            "|    explained_variance | 0.0667     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 299        |\n",
            "|    policy_loss        | 141        |\n",
            "|    reward             | -4.3429856 |\n",
            "|    std                | 1          |\n",
            "|    value_loss         | 15.6       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 51         |\n",
            "|    iterations         | 400        |\n",
            "|    time_elapsed       | 39         |\n",
            "|    total_timesteps    | 2000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.3      |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 399        |\n",
            "|    policy_loss        | 28.9       |\n",
            "|    reward             | -2.9280229 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 0.941      |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 54       |\n",
            "|    iterations         | 500      |\n",
            "|    time_elapsed       | 46       |\n",
            "|    total_timesteps    | 2500     |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.2    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 499      |\n",
            "|    policy_loss        | -356     |\n",
            "|    reward             | 2.440834 |\n",
            "|    std                | 1        |\n",
            "|    value_loss         | 95.7     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 52        |\n",
            "|    iterations         | 600       |\n",
            "|    time_elapsed       | 56        |\n",
            "|    total_timesteps    | 3000      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.3     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 599       |\n",
            "|    policy_loss        | -187      |\n",
            "|    reward             | 7.7011724 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 30.6      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 55         |\n",
            "|    iterations         | 700        |\n",
            "|    time_elapsed       | 63         |\n",
            "|    total_timesteps    | 3500       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.4      |\n",
            "|    explained_variance | -0.042     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 699        |\n",
            "|    policy_loss        | 23.3       |\n",
            "|    reward             | -1.0782235 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 0.963      |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 56          |\n",
            "|    iterations         | 800         |\n",
            "|    time_elapsed       | 71          |\n",
            "|    total_timesteps    | 4000        |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.4       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 799         |\n",
            "|    policy_loss        | -258        |\n",
            "|    reward             | -0.20911986 |\n",
            "|    std                | 1.01        |\n",
            "|    value_loss         | 50.4        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 55        |\n",
            "|    iterations         | 900       |\n",
            "|    time_elapsed       | 81        |\n",
            "|    total_timesteps    | 4500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | 0.118     |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 899       |\n",
            "|    policy_loss        | -66.9     |\n",
            "|    reward             | 0.8433642 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 2.9       |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 57         |\n",
            "|    iterations         | 1000       |\n",
            "|    time_elapsed       | 87         |\n",
            "|    total_timesteps    | 5000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.4      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 999        |\n",
            "|    policy_loss        | 5.19       |\n",
            "|    reward             | -1.4874439 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 4.01       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 57        |\n",
            "|    iterations         | 1100      |\n",
            "|    time_elapsed       | 96        |\n",
            "|    total_timesteps    | 5500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | -0.555    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1099      |\n",
            "|    policy_loss        | -77.7     |\n",
            "|    reward             | 1.8939301 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 3.97      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 57        |\n",
            "|    iterations         | 1200      |\n",
            "|    time_elapsed       | 105       |\n",
            "|    total_timesteps    | 6000      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1199      |\n",
            "|    policy_loss        | 187       |\n",
            "|    reward             | 2.3026025 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 33.4      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 58        |\n",
            "|    iterations         | 1300      |\n",
            "|    time_elapsed       | 111       |\n",
            "|    total_timesteps    | 6500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1299      |\n",
            "|    policy_loss        | 47.1      |\n",
            "|    reward             | 1.9173757 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 5.89      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 54         |\n",
            "|    iterations         | 1400       |\n",
            "|    time_elapsed       | 127        |\n",
            "|    total_timesteps    | 7000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.4      |\n",
            "|    explained_variance | -0.0644    |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 1399       |\n",
            "|    policy_loss        | 32.9       |\n",
            "|    reward             | -3.0739012 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 1.06       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 55        |\n",
            "|    iterations         | 1500      |\n",
            "|    time_elapsed       | 134       |\n",
            "|    total_timesteps    | 7500      |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.4     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 1499      |\n",
            "|    policy_loss        | 305       |\n",
            "|    reward             | 2.5744946 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 50.5      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 55         |\n",
            "|    iterations         | 1600       |\n",
            "|    time_elapsed       | 144        |\n",
            "|    total_timesteps    | 8000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.4      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 1599       |\n",
            "|    policy_loss        | -4.88      |\n",
            "|    reward             | -2.1707737 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 1.28       |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 55          |\n",
            "|    iterations         | 1700        |\n",
            "|    time_elapsed       | 151         |\n",
            "|    total_timesteps    | 8500        |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.5       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 1699        |\n",
            "|    policy_loss        | 160         |\n",
            "|    reward             | -0.81266195 |\n",
            "|    std                | 1.01        |\n",
            "|    value_loss         | 16          |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 56         |\n",
            "|    iterations         | 1800       |\n",
            "|    time_elapsed       | 158        |\n",
            "|    total_timesteps    | 9000       |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.5      |\n",
            "|    explained_variance | -0.208     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 1799       |\n",
            "|    policy_loss        | 38.1       |\n",
            "|    reward             | -1.4904355 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 5.15       |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 56          |\n",
            "|    iterations         | 1900        |\n",
            "|    time_elapsed       | 169         |\n",
            "|    total_timesteps    | 9500        |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.5       |\n",
            "|    explained_variance | 1.79e-07    |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 1899        |\n",
            "|    policy_loss        | 282         |\n",
            "|    reward             | -0.36043915 |\n",
            "|    std                | 1.01        |\n",
            "|    value_loss         | 56          |\n",
            "---------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 56          |\n",
            "|    iterations         | 2000        |\n",
            "|    time_elapsed       | 175         |\n",
            "|    total_timesteps    | 10000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.5       |\n",
            "|    explained_variance | -0.0747     |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 1999        |\n",
            "|    policy_loss        | -471        |\n",
            "|    reward             | -0.37017918 |\n",
            "|    std                | 1.01        |\n",
            "|    value_loss         | 171         |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 57         |\n",
            "|    iterations         | 2100       |\n",
            "|    time_elapsed       | 183        |\n",
            "|    total_timesteps    | 10500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.5      |\n",
            "|    explained_variance | 0.163      |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2099       |\n",
            "|    policy_loss        | 1.74       |\n",
            "|    reward             | -1.2063048 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 0.28       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 56        |\n",
            "|    iterations         | 2200      |\n",
            "|    time_elapsed       | 193       |\n",
            "|    total_timesteps    | 11000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | -0.326    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2199      |\n",
            "|    policy_loss        | -94       |\n",
            "|    reward             | 1.8247845 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 5.61      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 57        |\n",
            "|    iterations         | 2300      |\n",
            "|    time_elapsed       | 199       |\n",
            "|    total_timesteps    | 11500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | 0.0682    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2299      |\n",
            "|    policy_loss        | -128      |\n",
            "|    reward             | 0.4869665 |\n",
            "|    std                | 1.01      |\n",
            "|    value_loss         | 15.7      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 57         |\n",
            "|    iterations         | 2400       |\n",
            "|    time_elapsed       | 208        |\n",
            "|    total_timesteps    | 12000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.5      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2399       |\n",
            "|    policy_loss        | -173       |\n",
            "|    reward             | -0.9164407 |\n",
            "|    std                | 1.01       |\n",
            "|    value_loss         | 23.1       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 57        |\n",
            "|    iterations         | 2500      |\n",
            "|    time_elapsed       | 217       |\n",
            "|    total_timesteps    | 12500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2499      |\n",
            "|    policy_loss        | 86.3      |\n",
            "|    reward             | 3.2540042 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 5.33      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 58        |\n",
            "|    iterations         | 2600      |\n",
            "|    time_elapsed       | 223       |\n",
            "|    total_timesteps    | 13000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.6     |\n",
            "|    explained_variance | 1.19e-07  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 2599      |\n",
            "|    policy_loss        | 428       |\n",
            "|    reward             | 2.4169402 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 112       |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 57         |\n",
            "|    iterations         | 2700       |\n",
            "|    time_elapsed       | 233        |\n",
            "|    total_timesteps    | 13500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.6      |\n",
            "|    explained_variance | 0.0601     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2699       |\n",
            "|    policy_loss        | 1.73       |\n",
            "|    reward             | -1.3785244 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 0.378      |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 2800       |\n",
            "|    time_elapsed       | 241        |\n",
            "|    total_timesteps    | 14000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.7      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2799       |\n",
            "|    policy_loss        | 45.3       |\n",
            "|    reward             | -1.8347946 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 4.25       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 2900       |\n",
            "|    time_elapsed       | 247        |\n",
            "|    total_timesteps    | 14500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.7      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2899       |\n",
            "|    policy_loss        | -49.6      |\n",
            "|    reward             | 0.13086061 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 1.75       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 3000       |\n",
            "|    time_elapsed       | 258        |\n",
            "|    total_timesteps    | 15000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.7      |\n",
            "|    explained_variance | -0.104     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 2999       |\n",
            "|    policy_loss        | -51.1      |\n",
            "|    reward             | -2.9340496 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 1.92       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 58        |\n",
            "|    iterations         | 3100      |\n",
            "|    time_elapsed       | 266       |\n",
            "|    total_timesteps    | 15500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.7     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 3099      |\n",
            "|    policy_loss        | -96       |\n",
            "|    reward             | 5.6104155 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 7.44      |\n",
            "-------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 58       |\n",
            "|    iterations         | 3200     |\n",
            "|    time_elapsed       | 273      |\n",
            "|    total_timesteps    | 16000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.7    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 3199     |\n",
            "|    policy_loss        | 288      |\n",
            "|    reward             | 4.10712  |\n",
            "|    std                | 1.02     |\n",
            "|    value_loss         | 56       |\n",
            "------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 3300       |\n",
            "|    time_elapsed       | 283        |\n",
            "|    total_timesteps    | 16500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.7      |\n",
            "|    explained_variance | 5.96e-08   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 3299       |\n",
            "|    policy_loss        | 29.9       |\n",
            "|    reward             | 0.10846165 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 6.75       |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 58          |\n",
            "|    iterations         | 3400        |\n",
            "|    time_elapsed       | 290         |\n",
            "|    total_timesteps    | 17000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.7       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 3399        |\n",
            "|    policy_loss        | -128        |\n",
            "|    reward             | -0.26822066 |\n",
            "|    std                | 1.02        |\n",
            "|    value_loss         | 12.3        |\n",
            "---------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 58          |\n",
            "|    iterations         | 3500        |\n",
            "|    time_elapsed       | 298         |\n",
            "|    total_timesteps    | 17500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.7       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 3499        |\n",
            "|    policy_loss        | 23.3        |\n",
            "|    reward             | 0.012110213 |\n",
            "|    std                | 1.02        |\n",
            "|    value_loss         | 0.832       |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 3600       |\n",
            "|    time_elapsed       | 308        |\n",
            "|    total_timesteps    | 18000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.8      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 3599       |\n",
            "|    policy_loss        | -94.5      |\n",
            "|    reward             | -0.6443226 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 11.1       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 58        |\n",
            "|    iterations         | 3700      |\n",
            "|    time_elapsed       | 314       |\n",
            "|    total_timesteps    | 18500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.8     |\n",
            "|    explained_variance | 1.19e-07  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 3699      |\n",
            "|    policy_loss        | -16.7     |\n",
            "|    reward             | 1.8698422 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 0.374     |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 3800       |\n",
            "|    time_elapsed       | 323        |\n",
            "|    total_timesteps    | 19000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.8      |\n",
            "|    explained_variance | 1.19e-07   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 3799       |\n",
            "|    policy_loss        | 166        |\n",
            "|    reward             | -1.3664656 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 19.5       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 3900       |\n",
            "|    time_elapsed       | 332        |\n",
            "|    total_timesteps    | 19500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.7      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 3899       |\n",
            "|    policy_loss        | 43.9       |\n",
            "|    reward             | -1.1592114 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 2.46       |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 58       |\n",
            "|    iterations         | 4000     |\n",
            "|    time_elapsed       | 339      |\n",
            "|    total_timesteps    | 20000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.8    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 3999     |\n",
            "|    policy_loss        | -31.6    |\n",
            "|    reward             | 1.018338 |\n",
            "|    std                | 1.02     |\n",
            "|    value_loss         | 0.683    |\n",
            "------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 4100       |\n",
            "|    time_elapsed       | 348        |\n",
            "|    total_timesteps    | 20500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.8      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 4099       |\n",
            "|    policy_loss        | -21.4      |\n",
            "|    reward             | 0.26098472 |\n",
            "|    std                | 1.02       |\n",
            "|    value_loss         | 0.295      |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 58        |\n",
            "|    iterations         | 4200      |\n",
            "|    time_elapsed       | 356       |\n",
            "|    total_timesteps    | 21000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.8     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4199      |\n",
            "|    policy_loss        | 37.3      |\n",
            "|    reward             | 2.0496662 |\n",
            "|    std                | 1.02      |\n",
            "|    value_loss         | 1.24      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 4300      |\n",
            "|    time_elapsed       | 362       |\n",
            "|    total_timesteps    | 21500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -41.9     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4299      |\n",
            "|    policy_loss        | 21.7      |\n",
            "|    reward             | 0.5919729 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 0.614     |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 58          |\n",
            "|    iterations         | 4400        |\n",
            "|    time_elapsed       | 373         |\n",
            "|    total_timesteps    | 22000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -41.9       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 4399        |\n",
            "|    policy_loss        | -59.5       |\n",
            "|    reward             | -0.44648832 |\n",
            "|    std                | 1.03        |\n",
            "|    value_loss         | 2.35        |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 4500       |\n",
            "|    time_elapsed       | 380        |\n",
            "|    total_timesteps    | 22500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -41.9      |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 4499       |\n",
            "|    policy_loss        | 75.7       |\n",
            "|    reward             | -1.7295737 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 5.7        |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 59       |\n",
            "|    iterations         | 4600     |\n",
            "|    time_elapsed       | 387      |\n",
            "|    total_timesteps    | 23000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -41.9    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 4599     |\n",
            "|    policy_loss        | -194     |\n",
            "|    reward             | -2.21535 |\n",
            "|    std                | 1.03     |\n",
            "|    value_loss         | 37.3     |\n",
            "------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 58         |\n",
            "|    iterations         | 4700       |\n",
            "|    time_elapsed       | 398        |\n",
            "|    total_timesteps    | 23500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42        |\n",
            "|    explained_variance | -0.0141    |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 4699       |\n",
            "|    policy_loss        | -32.7      |\n",
            "|    reward             | 0.16243774 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 1.75       |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 59       |\n",
            "|    iterations         | 4800     |\n",
            "|    time_elapsed       | 404      |\n",
            "|    total_timesteps    | 24000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -42      |\n",
            "|    explained_variance | 0.168    |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 4799     |\n",
            "|    policy_loss        | -61.7    |\n",
            "|    reward             | 0.961177 |\n",
            "|    std                | 1.03     |\n",
            "|    value_loss         | 2.67     |\n",
            "------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 59       |\n",
            "|    iterations         | 4900     |\n",
            "|    time_elapsed       | 412      |\n",
            "|    total_timesteps    | 24500    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -42      |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 4899     |\n",
            "|    policy_loss        | -54.1    |\n",
            "|    reward             | 3.000443 |\n",
            "|    std                | 1.03     |\n",
            "|    value_loss         | 2.52     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 5000      |\n",
            "|    time_elapsed       | 422       |\n",
            "|    total_timesteps    | 25000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 4999      |\n",
            "|    policy_loss        | 75.7      |\n",
            "|    reward             | 0.7883484 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 6.61      |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 5100        |\n",
            "|    time_elapsed       | 428         |\n",
            "|    total_timesteps    | 25500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.1       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 5099        |\n",
            "|    policy_loss        | 237         |\n",
            "|    reward             | -0.49083808 |\n",
            "|    std                | 1.03        |\n",
            "|    value_loss         | 39.1        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 5200      |\n",
            "|    time_elapsed       | 437       |\n",
            "|    total_timesteps    | 26000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.1     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5199      |\n",
            "|    policy_loss        | 152       |\n",
            "|    reward             | 2.7196112 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 16.1      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 5300       |\n",
            "|    time_elapsed       | 445        |\n",
            "|    total_timesteps    | 26500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.1      |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5299       |\n",
            "|    policy_loss        | -317       |\n",
            "|    reward             | 0.59174556 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 63.7       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 5400       |\n",
            "|    time_elapsed       | 452        |\n",
            "|    total_timesteps    | 27000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5399       |\n",
            "|    policy_loss        | -126       |\n",
            "|    reward             | 0.06384493 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 9.43       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 5500       |\n",
            "|    time_elapsed       | 461        |\n",
            "|    total_timesteps    | 27500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.1      |\n",
            "|    explained_variance | 1.19e-07   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5499       |\n",
            "|    policy_loss        | -11.3      |\n",
            "|    reward             | -1.1629822 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 0.213      |\n",
            "--------------------------------------\n",
            "------------------------------------\n",
            "| time/                 |          |\n",
            "|    fps                | 59       |\n",
            "|    iterations         | 5600     |\n",
            "|    time_elapsed       | 469      |\n",
            "|    total_timesteps    | 28000    |\n",
            "| train/                |          |\n",
            "|    entropy_loss       | -42.1    |\n",
            "|    explained_variance | 0        |\n",
            "|    learning_rate      | 0.0007   |\n",
            "|    n_updates          | 5599     |\n",
            "|    policy_loss        | 91       |\n",
            "|    reward             | 1.35537  |\n",
            "|    std                | 1.03     |\n",
            "|    value_loss         | 5.83     |\n",
            "------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 5700      |\n",
            "|    time_elapsed       | 476       |\n",
            "|    total_timesteps    | 28500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.1     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5699      |\n",
            "|    policy_loss        | -18.6     |\n",
            "|    reward             | -2.177703 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 0.358     |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 5800       |\n",
            "|    time_elapsed       | 487        |\n",
            "|    total_timesteps    | 29000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42        |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5799       |\n",
            "|    policy_loss        | -36.6      |\n",
            "|    reward             | -2.1937134 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 2.54       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 5900       |\n",
            "|    time_elapsed       | 497        |\n",
            "|    total_timesteps    | 29500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42        |\n",
            "|    explained_variance | 1.19e-07   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 5899       |\n",
            "|    policy_loss        | -94.3      |\n",
            "|    reward             | -1.7350562 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 7.48       |\n",
            "--------------------------------------\n",
            "day: 3330, episode: 10\n",
            "begin_total_asset: 952508.66\n",
            "end_total_asset: 4088694.53\n",
            "total_reward: 3136185.87\n",
            "total_cost: 3157.22\n",
            "total_trades: 58734\n",
            "Sharpe: 0.733\n",
            "=================================\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 6000      |\n",
            "|    time_elapsed       | 507       |\n",
            "|    total_timesteps    | 30000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42       |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 5999      |\n",
            "|    policy_loss        | 9.33      |\n",
            "|    reward             | 1.7072018 |\n",
            "|    std                | 1.03      |\n",
            "|    value_loss         | 0.168     |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 6100       |\n",
            "|    time_elapsed       | 515        |\n",
            "|    total_timesteps    | 30500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42        |\n",
            "|    explained_variance | 0.137      |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6099       |\n",
            "|    policy_loss        | 86.1       |\n",
            "|    reward             | 0.23781453 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 5.84       |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 6200        |\n",
            "|    time_elapsed       | 522         |\n",
            "|    total_timesteps    | 31000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42         |\n",
            "|    explained_variance | 1.19e-07    |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 6199        |\n",
            "|    policy_loss        | 81.6        |\n",
            "|    reward             | -0.55448675 |\n",
            "|    std                | 1.03        |\n",
            "|    value_loss         | 4.51        |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 6300       |\n",
            "|    time_elapsed       | 532        |\n",
            "|    total_timesteps    | 31500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.1      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6299       |\n",
            "|    policy_loss        | -123       |\n",
            "|    reward             | 0.53070265 |\n",
            "|    std                | 1.03       |\n",
            "|    value_loss         | 10.1       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 6400       |\n",
            "|    time_elapsed       | 539        |\n",
            "|    total_timesteps    | 32000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.2      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 6399       |\n",
            "|    policy_loss        | -35.1      |\n",
            "|    reward             | -0.7190698 |\n",
            "|    std                | 1.04       |\n",
            "|    value_loss         | 0.746      |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 6500        |\n",
            "|    time_elapsed       | 547         |\n",
            "|    total_timesteps    | 32500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.1       |\n",
            "|    explained_variance | -1.19e-07   |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 6499        |\n",
            "|    policy_loss        | -195        |\n",
            "|    reward             | -0.20805828 |\n",
            "|    std                | 1.04        |\n",
            "|    value_loss         | 24.3        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 6600      |\n",
            "|    time_elapsed       | 557       |\n",
            "|    total_timesteps    | 33000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.1     |\n",
            "|    explained_variance | 0.0285    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6599      |\n",
            "|    policy_loss        | -113      |\n",
            "|    reward             | -2.668644 |\n",
            "|    std                | 1.04      |\n",
            "|    value_loss         | 13.9      |\n",
            "-------------------------------------\n",
            "----------------------------------------\n",
            "| time/                 |              |\n",
            "|    fps                | 59           |\n",
            "|    iterations         | 6700         |\n",
            "|    time_elapsed       | 563          |\n",
            "|    total_timesteps    | 33500        |\n",
            "| train/                |              |\n",
            "|    entropy_loss       | -42.2        |\n",
            "|    explained_variance | -0.603       |\n",
            "|    learning_rate      | 0.0007       |\n",
            "|    n_updates          | 6699         |\n",
            "|    policy_loss        | -39.6        |\n",
            "|    reward             | -0.083356254 |\n",
            "|    std                | 1.04         |\n",
            "|    value_loss         | 0.818        |\n",
            "----------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 6800      |\n",
            "|    time_elapsed       | 572       |\n",
            "|    total_timesteps    | 34000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.2     |\n",
            "|    explained_variance | 0.0184    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6799      |\n",
            "|    policy_loss        | 86.5      |\n",
            "|    reward             | 0.6618178 |\n",
            "|    std                | 1.04      |\n",
            "|    value_loss         | 5.35      |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 6900        |\n",
            "|    time_elapsed       | 581         |\n",
            "|    total_timesteps    | 34500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.2       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 6899        |\n",
            "|    policy_loss        | 56.7        |\n",
            "|    reward             | 0.052872755 |\n",
            "|    std                | 1.04        |\n",
            "|    value_loss         | 2.85        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 7000      |\n",
            "|    time_elapsed       | 587       |\n",
            "|    total_timesteps    | 35000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.3     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 6999      |\n",
            "|    policy_loss        | 197       |\n",
            "|    reward             | 1.6442178 |\n",
            "|    std                | 1.04      |\n",
            "|    value_loss         | 26.3      |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 7100        |\n",
            "|    time_elapsed       | 597         |\n",
            "|    total_timesteps    | 35500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.3       |\n",
            "|    explained_variance | -0.0238     |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 7099        |\n",
            "|    policy_loss        | 39.7        |\n",
            "|    reward             | -0.16224274 |\n",
            "|    std                | 1.04        |\n",
            "|    value_loss         | 1.43        |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 7200       |\n",
            "|    time_elapsed       | 605        |\n",
            "|    total_timesteps    | 36000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.3      |\n",
            "|    explained_variance | 0.03       |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7199       |\n",
            "|    policy_loss        | 139        |\n",
            "|    reward             | -0.1674491 |\n",
            "|    std                | 1.04       |\n",
            "|    value_loss         | 11.7       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 7300      |\n",
            "|    time_elapsed       | 611       |\n",
            "|    total_timesteps    | 36500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.4     |\n",
            "|    explained_variance | -0.0288   |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 7299      |\n",
            "|    policy_loss        | -406      |\n",
            "|    reward             | 2.2645469 |\n",
            "|    std                | 1.04      |\n",
            "|    value_loss         | 134       |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 7400       |\n",
            "|    time_elapsed       | 622        |\n",
            "|    total_timesteps    | 37000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | 0.0351     |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7399       |\n",
            "|    policy_loss        | 73.6       |\n",
            "|    reward             | 0.30078474 |\n",
            "|    std                | 1.04       |\n",
            "|    value_loss         | 3.6        |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 7500       |\n",
            "|    time_elapsed       | 629        |\n",
            "|    total_timesteps    | 37500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.3      |\n",
            "|    explained_variance | 5.96e-08   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7499       |\n",
            "|    policy_loss        | -8.08      |\n",
            "|    reward             | -0.3665664 |\n",
            "|    std                | 1.04       |\n",
            "|    value_loss         | 0.214      |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 7600        |\n",
            "|    time_elapsed       | 636         |\n",
            "|    total_timesteps    | 38000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.4       |\n",
            "|    explained_variance | -1.19e-07   |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 7599        |\n",
            "|    policy_loss        | 42.9        |\n",
            "|    reward             | -0.79383886 |\n",
            "|    std                | 1.04        |\n",
            "|    value_loss         | 1.9         |\n",
            "---------------------------------------\n",
            "----------------------------------------\n",
            "| time/                 |              |\n",
            "|    fps                | 59           |\n",
            "|    iterations         | 7700         |\n",
            "|    time_elapsed       | 647          |\n",
            "|    total_timesteps    | 38500        |\n",
            "| train/                |              |\n",
            "|    entropy_loss       | -42.4        |\n",
            "|    explained_variance | -1.19e-07    |\n",
            "|    learning_rate      | 0.0007       |\n",
            "|    n_updates          | 7699         |\n",
            "|    policy_loss        | -104         |\n",
            "|    reward             | -0.073217735 |\n",
            "|    std                | 1.05         |\n",
            "|    value_loss         | 10.3         |\n",
            "----------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 7800       |\n",
            "|    time_elapsed       | 653        |\n",
            "|    total_timesteps    | 39000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7799       |\n",
            "|    policy_loss        | -152       |\n",
            "|    reward             | -1.8329335 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 16.7       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 7900       |\n",
            "|    time_elapsed       | 661        |\n",
            "|    total_timesteps    | 39500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 7899       |\n",
            "|    policy_loss        | 144        |\n",
            "|    reward             | -0.8008484 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 15.8       |\n",
            "--------------------------------------\n",
            "----------------------------------------\n",
            "| time/                 |              |\n",
            "|    fps                | 59           |\n",
            "|    iterations         | 8000         |\n",
            "|    time_elapsed       | 671          |\n",
            "|    total_timesteps    | 40000        |\n",
            "| train/                |              |\n",
            "|    entropy_loss       | -42.3        |\n",
            "|    explained_variance | 5.96e-08     |\n",
            "|    learning_rate      | 0.0007       |\n",
            "|    n_updates          | 7999         |\n",
            "|    policy_loss        | -8.53        |\n",
            "|    reward             | -0.031915538 |\n",
            "|    std                | 1.04         |\n",
            "|    value_loss         | 0.0835       |\n",
            "----------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 8100      |\n",
            "|    time_elapsed       | 677       |\n",
            "|    total_timesteps    | 40500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.4     |\n",
            "|    explained_variance | 5.96e-08  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8099      |\n",
            "|    policy_loss        | -69.3     |\n",
            "|    reward             | 0.8095603 |\n",
            "|    std                | 1.04      |\n",
            "|    value_loss         | 3.08      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 8200       |\n",
            "|    time_elapsed       | 686        |\n",
            "|    total_timesteps    | 41000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.3      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8199       |\n",
            "|    policy_loss        | -5.2       |\n",
            "|    reward             | -0.5655167 |\n",
            "|    std                | 1.04       |\n",
            "|    value_loss         | 0.69       |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 8300       |\n",
            "|    time_elapsed       | 695        |\n",
            "|    total_timesteps    | 41500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8299       |\n",
            "|    policy_loss        | -29.8      |\n",
            "|    reward             | -1.5929188 |\n",
            "|    std                | 1.04       |\n",
            "|    value_loss         | 0.672      |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 8400        |\n",
            "|    time_elapsed       | 701         |\n",
            "|    total_timesteps    | 42000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.4       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 8399        |\n",
            "|    policy_loss        | -54.2       |\n",
            "|    reward             | -0.53150016 |\n",
            "|    std                | 1.05        |\n",
            "|    value_loss         | 9.5         |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 8500      |\n",
            "|    time_elapsed       | 711       |\n",
            "|    total_timesteps    | 42500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8499      |\n",
            "|    policy_loss        | 237       |\n",
            "|    reward             | 2.7706447 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 42.1      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 8600      |\n",
            "|    time_elapsed       | 719       |\n",
            "|    total_timesteps    | 43000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 8599      |\n",
            "|    policy_loss        | -188      |\n",
            "|    reward             | 1.1153419 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 21.7      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 8700       |\n",
            "|    time_elapsed       | 730        |\n",
            "|    total_timesteps    | 43500      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | -1.19e-07  |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8699       |\n",
            "|    policy_loss        | -17.4      |\n",
            "|    reward             | -0.5148427 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 0.297      |\n",
            "--------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 8800       |\n",
            "|    time_elapsed       | 740        |\n",
            "|    total_timesteps    | 44000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | 1.19e-07   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8799       |\n",
            "|    policy_loss        | 41.4       |\n",
            "|    reward             | 0.32814896 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 1.6        |\n",
            "--------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 8900        |\n",
            "|    time_elapsed       | 746         |\n",
            "|    total_timesteps    | 44500       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.4       |\n",
            "|    explained_variance | 0           |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 8899        |\n",
            "|    policy_loss        | -47.2       |\n",
            "|    reward             | -0.17413093 |\n",
            "|    std                | 1.05        |\n",
            "|    value_loss         | 1.75        |\n",
            "---------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 9000       |\n",
            "|    time_elapsed       | 755        |\n",
            "|    total_timesteps    | 45000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.4      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 8999       |\n",
            "|    policy_loss        | 65.1       |\n",
            "|    reward             | 0.38266626 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 6.64       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 9100      |\n",
            "|    time_elapsed       | 764       |\n",
            "|    total_timesteps    | 45500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.4     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9099      |\n",
            "|    policy_loss        | 31.2      |\n",
            "|    reward             | 1.3317974 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 0.927     |\n",
            "-------------------------------------\n",
            "---------------------------------------\n",
            "| time/                 |             |\n",
            "|    fps                | 59          |\n",
            "|    iterations         | 9200        |\n",
            "|    time_elapsed       | 770         |\n",
            "|    total_timesteps    | 46000       |\n",
            "| train/                |             |\n",
            "|    entropy_loss       | -42.4       |\n",
            "|    explained_variance | -0.0927     |\n",
            "|    learning_rate      | 0.0007      |\n",
            "|    n_updates          | 9199        |\n",
            "|    policy_loss        | 181         |\n",
            "|    reward             | -0.49035767 |\n",
            "|    std                | 1.05        |\n",
            "|    value_loss         | 21.3        |\n",
            "---------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 9300      |\n",
            "|    time_elapsed       | 780       |\n",
            "|    total_timesteps    | 46500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | 1.19e-07  |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9299      |\n",
            "|    policy_loss        | 148       |\n",
            "|    reward             | -8.756936 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 30.5      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 9400      |\n",
            "|    time_elapsed       | 788       |\n",
            "|    total_timesteps    | 47000     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | -0.066    |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9399      |\n",
            "|    policy_loss        | 40.5      |\n",
            "|    reward             | 0.5117786 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 1.51      |\n",
            "-------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 9500      |\n",
            "|    time_elapsed       | 794       |\n",
            "|    total_timesteps    | 47500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | 0         |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9499      |\n",
            "|    policy_loss        | 46.4      |\n",
            "|    reward             | 1.5631902 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 1.37      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 9600       |\n",
            "|    time_elapsed       | 804        |\n",
            "|    total_timesteps    | 48000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.5      |\n",
            "|    explained_variance | 5.96e-08   |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9599       |\n",
            "|    policy_loss        | 4.73       |\n",
            "|    reward             | -0.8106855 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 0.346      |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 9700      |\n",
            "|    time_elapsed       | 811       |\n",
            "|    total_timesteps    | 48500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | -1.19e-07 |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9699      |\n",
            "|    policy_loss        | 60.8      |\n",
            "|    reward             | 1.219504  |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 3.44      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 9800       |\n",
            "|    time_elapsed       | 818        |\n",
            "|    total_timesteps    | 49000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.5      |\n",
            "|    explained_variance | 0.00147    |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9799       |\n",
            "|    policy_loss        | -19        |\n",
            "|    reward             | 0.36547118 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 6.68       |\n",
            "--------------------------------------\n",
            "-------------------------------------\n",
            "| time/                 |           |\n",
            "|    fps                | 59        |\n",
            "|    iterations         | 9900      |\n",
            "|    time_elapsed       | 829       |\n",
            "|    total_timesteps    | 49500     |\n",
            "| train/                |           |\n",
            "|    entropy_loss       | -42.5     |\n",
            "|    explained_variance | -0.0611   |\n",
            "|    learning_rate      | 0.0007    |\n",
            "|    n_updates          | 9899      |\n",
            "|    policy_loss        | -14       |\n",
            "|    reward             | 1.2229353 |\n",
            "|    std                | 1.05      |\n",
            "|    value_loss         | 2.29      |\n",
            "-------------------------------------\n",
            "--------------------------------------\n",
            "| time/                 |            |\n",
            "|    fps                | 59         |\n",
            "|    iterations         | 10000      |\n",
            "|    time_elapsed       | 835        |\n",
            "|    total_timesteps    | 50000      |\n",
            "| train/                |            |\n",
            "|    entropy_loss       | -42.6      |\n",
            "|    explained_variance | 0          |\n",
            "|    learning_rate      | 0.0007     |\n",
            "|    n_updates          | 9999       |\n",
            "|    policy_loss        | -15.6      |\n",
            "|    reward             | 0.31784078 |\n",
            "|    std                | 1.05       |\n",
            "|    value_loss         | 0.296      |\n",
            "--------------------------------------\n",
            "{'batch_size': 128, 'buffer_size': 50000, 'learning_rate': 0.001}\n",
            "Using cpu device\n",
            "Logging to results/ddpg\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 23         |\n",
            "|    time_elapsed    | 556        |\n",
            "|    total_timesteps | 13324      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 20.3       |\n",
            "|    critic_loss     | 66.8       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 9993       |\n",
            "|    reward          | -4.5011277 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 21         |\n",
            "|    time_elapsed    | 1250       |\n",
            "|    total_timesteps | 26648      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 2.62       |\n",
            "|    critic_loss     | 9.79       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 23317      |\n",
            "|    reward          | -4.5011277 |\n",
            "-----------------------------------\n",
            "day: 3330, episode: 10\n",
            "begin_total_asset: 965326.95\n",
            "end_total_asset: 3940368.63\n",
            "total_reward: 2975041.68\n",
            "total_cost: 964.36\n",
            "total_trades: 53280\n",
            "Sharpe: 0.657\n",
            "=================================\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 12         |\n",
            "|    fps             | 20         |\n",
            "|    time_elapsed    | 1944       |\n",
            "|    total_timesteps | 39972      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | -3.63      |\n",
            "|    critic_loss     | 2.39       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 36641      |\n",
            "|    reward          | -4.5011277 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 16         |\n",
            "|    fps             | 20         |\n",
            "|    time_elapsed    | 2656       |\n",
            "|    total_timesteps | 53296      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | -6.92      |\n",
            "|    critic_loss     | 1.51       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 49965      |\n",
            "|    reward          | -4.5011277 |\n",
            "-----------------------------------\n",
            "{'n_steps': 2048, 'ent_coef': 0.01, 'learning_rate': 0.00025, 'batch_size': 128}\n",
            "Using cpu device\n",
            "Logging to results/ppo\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    fps             | 70         |\n",
            "|    iterations      | 1          |\n",
            "|    time_elapsed    | 29         |\n",
            "|    total_timesteps | 2048       |\n",
            "| train/             |            |\n",
            "|    reward          | -0.3290882 |\n",
            "-----------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 67          |\n",
            "|    iterations           | 2           |\n",
            "|    time_elapsed         | 60          |\n",
            "|    total_timesteps      | 4096        |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.019916927 |\n",
            "|    clip_fraction        | 0.207       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.2       |\n",
            "|    explained_variance   | -0.00611    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 6.42        |\n",
            "|    n_updates            | 10          |\n",
            "|    policy_gradient_loss | -0.0268     |\n",
            "|    reward               | 0.84259444  |\n",
            "|    std                  | 1           |\n",
            "|    value_loss           | 15          |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 65          |\n",
            "|    iterations           | 3           |\n",
            "|    time_elapsed         | 93          |\n",
            "|    total_timesteps      | 6144        |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.016416349 |\n",
            "|    clip_fraction        | 0.211       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.3       |\n",
            "|    explained_variance   | 0.00243     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 71.4        |\n",
            "|    n_updates            | 20          |\n",
            "|    policy_gradient_loss | -0.0189     |\n",
            "|    reward               | -22.102169  |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 95.2        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 65          |\n",
            "|    iterations           | 4           |\n",
            "|    time_elapsed         | 125         |\n",
            "|    total_timesteps      | 8192        |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.016711425 |\n",
            "|    clip_fraction        | 0.152       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.3       |\n",
            "|    explained_variance   | -0.0235     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 19.2        |\n",
            "|    n_updates            | 30          |\n",
            "|    policy_gradient_loss | -0.0181     |\n",
            "|    reward               | 0.8641611   |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 51          |\n",
            "-----------------------------------------\n",
            "----------------------------------------\n",
            "| time/                   |            |\n",
            "|    fps                  | 64         |\n",
            "|    iterations           | 5          |\n",
            "|    time_elapsed         | 158        |\n",
            "|    total_timesteps      | 10240      |\n",
            "| train/                  |            |\n",
            "|    approx_kl            | 0.02179965 |\n",
            "|    clip_fraction        | 0.258      |\n",
            "|    clip_range           | 0.2        |\n",
            "|    entropy_loss         | -41.3      |\n",
            "|    explained_variance   | -0.00376   |\n",
            "|    learning_rate        | 0.00025    |\n",
            "|    loss                 | 24.8       |\n",
            "|    n_updates            | 40         |\n",
            "|    policy_gradient_loss | -0.0161    |\n",
            "|    reward               | 0.7124557  |\n",
            "|    std                  | 1.01       |\n",
            "|    value_loss           | 37.7       |\n",
            "----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 6           |\n",
            "|    time_elapsed         | 189         |\n",
            "|    total_timesteps      | 12288       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.020254686 |\n",
            "|    clip_fraction        | 0.206       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.4       |\n",
            "|    explained_variance   | -0.02       |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 15.9        |\n",
            "|    n_updates            | 50          |\n",
            "|    policy_gradient_loss | -0.0192     |\n",
            "|    reward               | 2.9676142   |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 56          |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 7           |\n",
            "|    time_elapsed         | 221         |\n",
            "|    total_timesteps      | 14336       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.015349641 |\n",
            "|    clip_fraction        | 0.182       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.5       |\n",
            "|    explained_variance   | 0.00714     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 7.18        |\n",
            "|    n_updates            | 60          |\n",
            "|    policy_gradient_loss | -0.0222     |\n",
            "|    reward               | -1.0227845  |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 12.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 8           |\n",
            "|    time_elapsed         | 254         |\n",
            "|    total_timesteps      | 16384       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.020761559 |\n",
            "|    clip_fraction        | 0.231       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.5       |\n",
            "|    explained_variance   | -0.00857    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 25.2        |\n",
            "|    n_updates            | 70          |\n",
            "|    policy_gradient_loss | -0.0199     |\n",
            "|    reward               | 0.80425155  |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 57.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 9           |\n",
            "|    time_elapsed         | 283         |\n",
            "|    total_timesteps      | 18432       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.018122694 |\n",
            "|    clip_fraction        | 0.236       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.5       |\n",
            "|    explained_variance   | 0.00296     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 28.1        |\n",
            "|    n_updates            | 80          |\n",
            "|    policy_gradient_loss | -0.0166     |\n",
            "|    reward               | -1.42386    |\n",
            "|    std                  | 1.01        |\n",
            "|    value_loss           | 57.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 10          |\n",
            "|    time_elapsed         | 318         |\n",
            "|    total_timesteps      | 20480       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.022673171 |\n",
            "|    clip_fraction        | 0.205       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.6       |\n",
            "|    explained_variance   | -0.013      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 17.3        |\n",
            "|    n_updates            | 90          |\n",
            "|    policy_gradient_loss | -0.0191     |\n",
            "|    reward               | 0.8197509   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 44.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 11          |\n",
            "|    time_elapsed         | 352         |\n",
            "|    total_timesteps      | 22528       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.020850785 |\n",
            "|    clip_fraction        | 0.214       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.6       |\n",
            "|    explained_variance   | -0.00669    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 48.4        |\n",
            "|    n_updates            | 100         |\n",
            "|    policy_gradient_loss | -0.0161     |\n",
            "|    reward               | 1.2033767   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 99.1        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 12          |\n",
            "|    time_elapsed         | 384         |\n",
            "|    total_timesteps      | 24576       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.024814304 |\n",
            "|    clip_fraction        | 0.251       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.6       |\n",
            "|    explained_variance   | -0.0225     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 10.8        |\n",
            "|    n_updates            | 110         |\n",
            "|    policy_gradient_loss | -0.018      |\n",
            "|    reward               | 1.610058    |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 22.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 13          |\n",
            "|    time_elapsed         | 416         |\n",
            "|    total_timesteps      | 26624       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017855735 |\n",
            "|    clip_fraction        | 0.173       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | 0.00501     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 34.5        |\n",
            "|    n_updates            | 120         |\n",
            "|    policy_gradient_loss | -0.0189     |\n",
            "|    reward               | 7.162905    |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 112         |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 14          |\n",
            "|    time_elapsed         | 446         |\n",
            "|    total_timesteps      | 28672       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.018644353 |\n",
            "|    clip_fraction        | 0.153       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.7       |\n",
            "|    explained_variance   | 0.0117      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 16.7        |\n",
            "|    n_updates            | 130         |\n",
            "|    policy_gradient_loss | -0.0172     |\n",
            "|    reward               | 2.0473788   |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 53.3        |\n",
            "-----------------------------------------\n",
            "day: 3330, episode: 10\n",
            "begin_total_asset: 994554.41\n",
            "end_total_asset: 4699503.39\n",
            "total_reward: 3704948.98\n",
            "total_cost: 439274.68\n",
            "total_trades: 90096\n",
            "Sharpe: 0.806\n",
            "=================================\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 15          |\n",
            "|    time_elapsed         | 480         |\n",
            "|    total_timesteps      | 30720       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.02508668  |\n",
            "|    clip_fraction        | 0.25        |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.8       |\n",
            "|    explained_variance   | -0.0505     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 4.42        |\n",
            "|    n_updates            | 140         |\n",
            "|    policy_gradient_loss | -0.0173     |\n",
            "|    reward               | -0.36127353 |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 14.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 16          |\n",
            "|    time_elapsed         | 510         |\n",
            "|    total_timesteps      | 32768       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.021448491 |\n",
            "|    clip_fraction        | 0.211       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.8       |\n",
            "|    explained_variance   | 0.00132     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 38          |\n",
            "|    n_updates            | 150         |\n",
            "|    policy_gradient_loss | -0.00894    |\n",
            "|    reward               | -2.4289682  |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 88          |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 17          |\n",
            "|    time_elapsed         | 542         |\n",
            "|    total_timesteps      | 34816       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.02103462  |\n",
            "|    clip_fraction        | 0.208       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.8       |\n",
            "|    explained_variance   | -0.0246     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 35.3        |\n",
            "|    n_updates            | 160         |\n",
            "|    policy_gradient_loss | -0.0134     |\n",
            "|    reward               | -0.71985894 |\n",
            "|    std                  | 1.02        |\n",
            "|    value_loss           | 54.5        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 18          |\n",
            "|    time_elapsed         | 577         |\n",
            "|    total_timesteps      | 36864       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.022089712 |\n",
            "|    clip_fraction        | 0.213       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.9       |\n",
            "|    explained_variance   | -0.0028     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 27.9        |\n",
            "|    n_updates            | 170         |\n",
            "|    policy_gradient_loss | -0.0207     |\n",
            "|    reward               | 0.11034006  |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 39.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 19          |\n",
            "|    time_elapsed         | 609         |\n",
            "|    total_timesteps      | 38912       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.014264661 |\n",
            "|    clip_fraction        | 0.126       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -41.9       |\n",
            "|    explained_variance   | -0.00283    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 58.6        |\n",
            "|    n_updates            | 180         |\n",
            "|    policy_gradient_loss | -0.0135     |\n",
            "|    reward               | 6.176509    |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 119         |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 20          |\n",
            "|    time_elapsed         | 642         |\n",
            "|    total_timesteps      | 40960       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.027180977 |\n",
            "|    clip_fraction        | 0.292       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | 0.0421      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 8.9         |\n",
            "|    n_updates            | 190         |\n",
            "|    policy_gradient_loss | -0.0156     |\n",
            "|    reward               | 0.20096779  |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 19.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 21          |\n",
            "|    time_elapsed         | 671         |\n",
            "|    total_timesteps      | 43008       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.021884244 |\n",
            "|    clip_fraction        | 0.205       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42         |\n",
            "|    explained_variance   | -0.00219    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 53.4        |\n",
            "|    n_updates            | 200         |\n",
            "|    policy_gradient_loss | -0.0145     |\n",
            "|    reward               | -0.839949   |\n",
            "|    std                  | 1.03        |\n",
            "|    value_loss           | 94.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 22          |\n",
            "|    time_elapsed         | 706         |\n",
            "|    total_timesteps      | 45056       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.024635753 |\n",
            "|    clip_fraction        | 0.235       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.1       |\n",
            "|    explained_variance   | -0.00329    |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 27.6        |\n",
            "|    n_updates            | 210         |\n",
            "|    policy_gradient_loss | -0.0148     |\n",
            "|    reward               | -0.21918707 |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 61.8        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 64          |\n",
            "|    iterations           | 23          |\n",
            "|    time_elapsed         | 735         |\n",
            "|    total_timesteps      | 47104       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.038902897 |\n",
            "|    clip_fraction        | 0.28        |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.2       |\n",
            "|    explained_variance   | -0.0241     |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 21.9        |\n",
            "|    n_updates            | 220         |\n",
            "|    policy_gradient_loss | -0.0178     |\n",
            "|    reward               | -0.12725857 |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 34.3        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 24          |\n",
            "|    time_elapsed         | 768         |\n",
            "|    total_timesteps      | 49152       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017998032 |\n",
            "|    clip_fraction        | 0.174       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.2       |\n",
            "|    explained_variance   | 0.0111      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 27.9        |\n",
            "|    n_updates            | 230         |\n",
            "|    policy_gradient_loss | -0.0148     |\n",
            "|    reward               | 1.7231001   |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 65.2        |\n",
            "-----------------------------------------\n",
            "-----------------------------------------\n",
            "| time/                   |             |\n",
            "|    fps                  | 63          |\n",
            "|    iterations           | 25          |\n",
            "|    time_elapsed         | 804         |\n",
            "|    total_timesteps      | 51200       |\n",
            "| train/                  |             |\n",
            "|    approx_kl            | 0.017844416 |\n",
            "|    clip_fraction        | 0.186       |\n",
            "|    clip_range           | 0.2         |\n",
            "|    entropy_loss         | -42.3       |\n",
            "|    explained_variance   | 0.0211      |\n",
            "|    learning_rate        | 0.00025     |\n",
            "|    loss                 | 13.5        |\n",
            "|    n_updates            | 240         |\n",
            "|    policy_gradient_loss | -0.0149     |\n",
            "|    reward               | -1.0208522  |\n",
            "|    std                  | 1.04        |\n",
            "|    value_loss           | 35.2        |\n",
            "-----------------------------------------\n",
            "{'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}\n",
            "Using cpu device\n",
            "Logging to results/sac\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 18         |\n",
            "|    time_elapsed    | 704        |\n",
            "|    total_timesteps | 13324      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 1.1e+03    |\n",
            "|    critic_loss     | 642        |\n",
            "|    ent_coef        | 0.169      |\n",
            "|    ent_coef_loss   | -83.1      |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 13223      |\n",
            "|    reward          | -4.2128644 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 18         |\n",
            "|    time_elapsed    | 1433       |\n",
            "|    total_timesteps | 26648      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 451        |\n",
            "|    critic_loss     | 27.5       |\n",
            "|    ent_coef        | 0.046      |\n",
            "|    ent_coef_loss   | -109       |\n",
            "|    learning_rate   | 0.0001     |\n",
            "|    n_updates       | 26547      |\n",
            "|    reward          | -4.2404695 |\n",
            "-----------------------------------\n",
            "day: 3330, episode: 10\n",
            "begin_total_asset: 953106.81\n",
            "end_total_asset: 7458866.64\n",
            "total_reward: 6505759.83\n",
            "total_cost: 8648.15\n",
            "total_trades: 59083\n",
            "Sharpe: 0.842\n",
            "=================================\n",
            "----------------------------------\n",
            "| time/              |           |\n",
            "|    episodes        | 12        |\n",
            "|    fps             | 18        |\n",
            "|    time_elapsed    | 2152      |\n",
            "|    total_timesteps | 39972     |\n",
            "| train/             |           |\n",
            "|    actor_loss      | 216       |\n",
            "|    critic_loss     | 38.6      |\n",
            "|    ent_coef        | 0.0127    |\n",
            "|    ent_coef_loss   | -102      |\n",
            "|    learning_rate   | 0.0001    |\n",
            "|    n_updates       | 39871     |\n",
            "|    reward          | -3.931381 |\n",
            "----------------------------------\n",
            "{'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001}\n",
            "Using cpu device\n",
            "Logging to results/td3\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 4          |\n",
            "|    fps             | 25         |\n",
            "|    time_elapsed    | 526        |\n",
            "|    total_timesteps | 13324      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 91.6       |\n",
            "|    critic_loss     | 1.45e+03   |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 9993       |\n",
            "|    reward          | -3.5290053 |\n",
            "-----------------------------------\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 8          |\n",
            "|    fps             | 22         |\n",
            "|    time_elapsed    | 1191       |\n",
            "|    total_timesteps | 26648      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 43.8       |\n",
            "|    critic_loss     | 317        |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 23317      |\n",
            "|    reward          | -3.5290053 |\n",
            "-----------------------------------\n",
            "day: 3330, episode: 10\n",
            "begin_total_asset: 972865.93\n",
            "end_total_asset: 3563567.55\n",
            "total_reward: 2590701.62\n",
            "total_cost: 971.89\n",
            "total_trades: 46620\n",
            "Sharpe: 0.648\n",
            "=================================\n",
            "-----------------------------------\n",
            "| time/              |            |\n",
            "|    episodes        | 12         |\n",
            "|    fps             | 21         |\n",
            "|    time_elapsed    | 1862       |\n",
            "|    total_timesteps | 39972      |\n",
            "| train/             |            |\n",
            "|    actor_loss      | 34.3       |\n",
            "|    critic_loss     | 54.4       |\n",
            "|    learning_rate   | 0.001      |\n",
            "|    n_updates       | 36641      |\n",
            "|    reward          | -3.5290053 |\n",
            "-----------------------------------\n"
          ]
        }
      ],
      "source": [
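        "# Training period, and the trading period that begins where training ends\n",
        "train_start_date = \"2009-01-01\"\n",
        "train_end_date = \"2022-07-01\"\n",
        "trade_start_date = \"2022-07-01\"\n",
        "trade_end_date = \"2022-11-01\"\n",
        "rolling_window_length = 22  # number of trading days in a rolling window\n",
        "\n",
        "# Flags for storing the agents' actions and the trading results\n",
        "if_store_actions = True\n",
        "if_store_result = True\n",
        "\n",
        "# Flags selecting which DRL agents to use\n",
        "if_using_a2c = True\n",
        "if_using_ddpg = True\n",
        "if_using_ppo = True\n",
        "if_using_sac = True\n",
        "if_using_td3 = True\n",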
        "stock_trading_rolling_window(\n",
        "    train_start_date,\n",
        "    train_end_date,\n",
        "    trade_start_date,\n",
        "    trade_end_date,\n",
        "    rolling_window_length,\n",
        "    if_store_actions=if_store_actions,\n",
        "    if_store_result=if_store_result,\n",
        "    if_using_a2c=if_using_a2c,\n",
        "    if_using_ddpg=if_using_ddpg,\n",
        "    if_using_ppo=if_using_ppo,\n",
        "    if_using_sac=if_using_sac,\n",
        "    if_using_td3=if_using_td3,\n",
        ")"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [
        "HMNR5nHjh1iz",
        "uijiWgkuh1jB",
        "MRiOtrywfAo1",
        "_gDkU-j-fCmZ",
        "3Zpv4S0-fDBv"
      ],
      "provenance": []
    },
    "kernelspec": {
      "display_name": "base",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.5 (default, Sep  4 2020, 02:22:02) \n[Clang 10.0.0 ]"
    },
    "vscode": {
      "interpreter": {
        "hash": "54cefccbf0f07c9750f12aa115c023dfa5ed4acecf9e7ad3bc9391869be60d0c"
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
